Essential bacterial genes and their use

ABSTRACT

Disclosed are 23 genes, termed “GEP” genes, found in Streptococcus pneumoniae, which are located within operons that are essential for survival. Also disclosed is a related essential gene found in Bacillus subtilis. These genes and the polypeptides that they encode, as well as homologs thereof, can be used to identify antibacterial agents for treating bacterial infections such as streptococcal pneumonia.

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims priority under 35 U.S.C. §119 from provisional application U.S. Serial No. 60/070,116, filed Dec. 31, 1997, which is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

[0002] The invention relates to essential bacterial genes and their use in identifying antibacterial agents.

[0003] Bacterial infections may be cutaneous, subcutaneous, or systemic. Opportunistic bacterial infections proliferate, especially in patients afflicted with AIDS or other diseases that compromise the immune system. The bacterium Streptococcus pneumoniae typically infects the respiratory tract and can cause lobar pneumonia, as well as meningitis, sinusitis, and other infections.

SUMMARY OF THE INVENTION

[0004] The invention is based on the discovery of 23 genes in the bacterium Streptococcus pneumoniae, and a related gene in the bacterium Bacillus subtilis, that are located within operons that are essential for survival. These 23 Streptococcus genes are referred to herein as “GEP genes” (which stands for general essential protein); for convenience, the polypeptides encoded by these genes are referred to herein as “GEP polypeptides.” Each GEP gene is located within an operon that contains a gene that is essential for survival of Streptococcus pneumoniae; the essential gene can be the GEP gene or another gene located within the same operon. Bacterial operons contain several genes that are related, e.g., with respect to function or biochemical pathway. Transcription of an operon leads to the production of a single transcript in which multiple coding regions are linked. Thus, an operon containing one or more essential genes can be considered an “essential operon,” since disruption of expression of one gene located within the operon will interfere with expression of the other genes in the operon. Each coding region of the transcript is separately translated into an individual polypeptide by ribosomes that initiate translation at multiple points along the transcript. Having identified one gene in the operon, one can readily identify and sequence the other genes located within the operon.

[0005] The genes encoding the GEP polypeptides are useful molecular tools for identifying similar genes in pathogenic microorganisms, such as pathogenic strains of Bacillus. In addition, the operons containing genes encoding GEP polypeptides, and the polypeptides encoded by such operons, are useful targets for identifying compounds that are inhibitors of the pathogens in which the GEP polypeptides are expressed. Such inhibitors inhibit bacterial growth by being bacteriostatic (e.g., inhibiting reproduction or cell division) or by being bactericidal (i.e., by causing cell death).

[0006] The invention, therefore, features an isolated polypeptide encoded by a nucleic acid located within an operon encoding a GEP polypeptide, termed gep103, having the amino acid sequence set forth in SEQ ID NO: 1, or conservative variations thereof. An isolated operon comprising a nucleic acid encoding gep103 also is included within the invention. In addition, the invention includes an isolated nucleic acid of (a) an operon comprising the sequence of SEQ ID NO: 2, as depicted in FIG. 1, or degenerate variants thereof; (b) an operon comprising the sequence of SEQ ID NO: 2, or degenerate variants thereof, wherein T is replaced by U; (c) nucleic acids complementary to (a) and (b); and (d) fragments of (a), (b), and (c) that are at least 15 base pairs in length and that hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 1. As described above for gep103, other nucleic acids and polypeptides encoded by nucleic acids located within operons encoding GEP polypeptides are included within the invention, including: (a) operons comprising the nucleic acids represented by the SEQ ID NOs. listed below, as depicted in the Figures listed below, or degenerate variants thereof; (b) operons comprising the nucleic acids represented by the SEQ ID NOs. listed below, wherein T is replaced by U; (c) nucleic acids complementary to (a) and (b); and (d) fragments of (a), (b), and (c) that are at least 15 base pairs in length and that hybridize under stringent conditions to genomic DNA encoding the polypeptides represented by the SEQ ID NOs. listed below.

TABLE 1
GEP nucleic acids and polypeptides

GEP NUCLEIC ACID   FIG.   SEQ ID NO. OF THE   SEQ ID NO. OF THE    SEQ ID NO. OF THE
OR POLYPEPTIDE     NO.    AMINO ACID          CODING STRAND OF     NON-CODING STRAND OF
                          SEQUENCE            THE NUCLEIC ACID     THE NUCLEIC ACID
                                              SEQUENCE             SEQUENCE
gep103             1      1                   2                    3
gep1119            2      4                   5                    6
gep1122            3      7                   8                    9
gep1315            4      10                  11                   12
gep1493            5      13                  14                   15
gep1507            6      16                  17                   18
gep1511            7      19                  20                   21
gep1518            8      22                  23                   24
gep1546            9      25                  26                   27
gep1551            10     28                  29                   30
gep1561            11     31                  32                   33
gep1580            12     34                  35                   36
gep1713            13     37                  38                   39
gep222             14     40                  41                   42
gep2283            15     43                  44                   45
gep273             16     46                  47                   48
gep286             17     49                  50                   51
gep311             18     52                  53                   54
gep3262            19     55                  56                   57
gep3387            20     58                  59                   60
gep47              21     61                  62                   63
gep61              22     64                  65                   66
gep76              23     67                  68                   69
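The nucleic acid forms recited in items (b) through (d) of paragraph [0006] can be illustrated with a minimal Python sketch. The function names and the 15-nucleotide example sequence are hypothetical and are not taken from any SEQ ID NO. of this specification. Hybridization under stringent conditions is an experimental property and is not modeled here; only the T-to-U replacement, the complementary strand, and the 15-nucleotide minimum length are shown.

```python
# Hypothetical helpers for illustration only; the example sequence is
# invented and does not correspond to any SEQ ID NO. in the specification.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_form(dna: str) -> str:
    """Return the sequence with every T replaced by U (item (b))."""
    return dna.upper().replace("T", "U")


def complementary_strand(dna: str) -> str:
    """Return the complementary strand, written 5' to 3' (item (c))."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def meets_minimum_length(fragment: str, minimum: int = 15) -> bool:
    """Check only the length criterion of item (d), not hybridization."""
    return len(fragment) >= minimum


example = "ATGGCTAAACTTTAA"  # invented 15-nucleotide coding strand
print(rna_form(example))              # AUGGCUAAACUUUAA
print(complementary_strand(example))  # TTAAAGTTTAGCCAT
print(meets_minimum_length(example))  # True
```

The complementary strand is written in the 5′ to 3′ direction, which is an assumption about presentation; the specification does not state the orientation in which the non-coding strands are depicted in the Figures.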

[0007] The invention also includes allelic variants (i.e., genes encoding isozymes) of the genes located within operons encoding the GEP polypeptides listed above. For example, the invention includes a gene that encodes a GEP polypeptide but which gene includes one or more point mutations, deletions, promoter variants, or splice site variants, provided that the resulting GEP polypeptide functions as a GEP polypeptide (e.g., as determined in a conventional complementation assay). Also included within the invention are isolated operons comprising a nucleic acid molecule containing the DNA sequence contained within the ATCC accession number ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, ______, or ______, as well as polypeptides encoded by these nucleic acid molecules.

[0008] Identification of these GEP genes and the determination that they are located within operons containing an essential gene allows homologs of the GEP genes to be found in other strains of Streptococcus. Also, orthologs of these genes can be identified in other species (e.g., Bacillus sp.). While “homologs” are structurally similar genes contained within a species, “orthologs” are functionally equivalent genes from other species (within or outside of a given genus, e.g., from Bacillus subtilis or E. coli). Such homologs and orthologs are expected to be located within operons that are essential for survival. Such homologous and orthologous genes and polypeptides can be used to identify compounds that inhibit the growth of the host organism (e.g., compounds that are bactericidal or bacteriostatic against pathogenic strains of the organism). Homologous and orthologous genes and polypeptides that are essential for survival can serve as targets for identifying a broad spectrum of antibacterial agents.

[0009] An ortholog of gep1493, termed B-yneS, has been identified in B. subtilis and is essential for survival of B. subtilis. The amino acid sequence (SEQ ID NO: 70), coding sequence (SEQ ID NO: 71), and non-coding sequence (SEQ ID NO: 72) of B-yneS are set forth in FIG. 24. As with the other polypeptides and genes disclosed herein, the B-yneS polypeptide and gene can be used in the methods described herein to identify antibacterial agents.

[0010] The term gep103 polypeptide or gene as used herein is intended to include the polypeptide and gene set forth in FIG. 1 herein, as well as homologs of the sequences set forth in FIG. 1. Also encompassed by the term gep103 gene are degenerate variants of the nucleic acid sequence set forth in FIG. 1 (SEQ ID NO: 2). Degenerate variants of a nucleic acid sequence exist because of the degeneracy of the genetic code; thus, those sequences that vary from the sequence represented by SEQ ID NO: 2, but which nonetheless encode a gep103 polypeptide, are included within the invention. Likewise, because of the similarity in the structures of amino acids, conservative variations (as described herein) can be made in the amino acid sequence of the gep103 polypeptide while retaining the function of the polypeptide (e.g., as determined in a conventional complementation assay). Other gep103 polypeptides and genes identified in additional Streptococcus strains may be such conservative variations or degenerate variants of the particular gep103 polypeptide and nucleic acid set forth in FIG. 1 (SEQ ID NOs: 1 and 2, respectively). The gep103 polypeptide and gene share at least 80%, e.g., 90%, sequence identity with SEQ ID NOs: 1 and 2, respectively. Regardless of the percent sequence identity between the gep103 sequence and the sequence represented by SEQ ID NOs: 1 and 2, the gep103 genes and polypeptides encompassed by the invention are able to complement for the lack of gep103 function (e.g., in a temperature-sensitive mutant) in a standard complementation assay. Additional gep103 genes that are identified and cloned from additional Streptococcus strains, and pathogenic strains in particular, can be used to produce gep103 polypeptides for use in the various methods described herein, e.g., for identifying antibacterial agents. Likewise, the terms gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76 encompass homologs, conservative variations, and degenerate variants of the sequences depicted in FIGS. 2-23, respectively. Such homologs, conservative variations, and degenerate variants also are included within the invention.
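The notion of a degenerate variant in paragraph [0010] can be made concrete with a short Python sketch. It assumes the standard genetic code and uses a deliberately partial codon table; the two DNA sequences are invented for illustration and are not taken from SEQ ID NO: 2 or any other sequence of this specification. The function names are likewise hypothetical.

```python
# Partial codon table (standard genetic code); it covers only the codons
# used in this illustration. "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding strand codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    peptide = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def are_degenerate_variants(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


seq_a = "ATGGCTAAACTTTAA"  # invented example
seq_b = "ATGGCCAAGTTGTAA"  # differs at three codons, same amino acids
print(translate(seq_a))                       # MAKL
print(translate(seq_b))                       # MAKL
print(are_degenerate_variants(seq_a, seq_b))  # True
```

Both invented sequences encode the same four-residue peptide although they differ at three nucleotide positions, which is the sense in which sequences that vary from a reference coding sequence can nonetheless encode the same polypeptide.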

[0011] Since the various GEP genes described herein have been identified and shown to be located within operons that are essential for survival, the GEP genes and polypeptides encoded by nucleic acid sequences located within operons containing GEP genes and their homologs and orthologs can be used to identify antibacterial agents. More specifically, the polypeptides encoded by nucleic acid sequences located within operons containing GEP genes can be used, separately or together, in assays to identify test compounds that bind to these polypeptides. Such test compounds are expected to be antibacterial agents, in contrast to compounds that do not bind to these GEP polypeptides. As described herein, any of a variety of art-known methods can be used to assay for binding of test compounds to the polypeptides. The invention includes, for example, a method for identifying an antibacterial agent where the method entails: (a) contacting a polypeptide encoded by a nucleic acid sequence located within an operon containing a GEP gene, or homolog or ortholog thereof, with a test compound; (b) detecting binding of the test compound to the polypeptide or homolog or ortholog; and (c) determining whether a test compound that binds to the polypeptide or homolog or ortholog inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of the test compound that binds to the polypeptide or homolog or ortholog, as an indication that the test compound is an antibacterial agent.

[0012] In various embodiments, the GEP polypeptide is derived from a non-pathogenic or pathogenic Streptococcus strain, such as Streptococcus pneumoniae, Streptococcus pyogenes, Streptococcus agalactiae, Streptococcus endocarditis, Streptococcus faecium, Streptococcus sanguis, Streptococcus viridans, and Streptococcus hemolyticus. Suitable orthologs of the Streptococcus GEP genes can be derived from the bacterium Bacillus subtilis. The test compound can be immobilized on a substrate, and binding of the test compound to the polypeptide or homolog or ortholog can be detected as immobilization of the polypeptide or homolog or ortholog on the immobilized test compound, e.g., in an immunoassay with an antibody that specifically binds to the polypeptide.

[0013] If desired, the test compound can be a test polypeptide (e.g., a polypeptide having a random or predetermined amino acid sequence; or a naturally-occurring or synthetic polypeptide). Alternatively, the test compound can be a nucleic acid, such as a DNA or RNA molecule. In addition, small organic molecules can be tested. The test compound can be a naturally-occurring compound or it can be synthetically produced, if desired. Synthetic libraries, chemical libraries, and the like can be screened to identify compounds that bind to the polypeptides. More generally, binding of test compounds to the polypeptide or homolog or ortholog can be detected either in vitro or in vivo. Regardless of the source of the test compound, the polypeptides described herein can be used to identify compounds that are bactericidal or bacteriostatic to a variety of pathogenic or non-pathogenic strains.

[0014] In an exemplary method, binding of a test compound to a polypeptide encoded by a nucleic acid located within an operon containing a GEP gene can be detected in a conventional two-hybrid system for detecting protein/protein interactions (e.g., in yeast or mammalian cells). Generally, in such a method, (a) the polypeptide encoded by a nucleic acid located within an operon containing a GEP gene is provided as a fusion protein that includes the polypeptide fused to (i) a transcription activation domain of a transcription factor or (ii) a DNA-binding domain of a transcription factor; (b) the test polypeptide is provided as a fusion protein that includes the test polypeptide fused to (i) a transcription activation domain of a transcription factor or (ii) a DNA-binding domain of a transcription factor; and (c) binding of the test polypeptide to the polypeptide is detected as reconstitution of a transcription factor. Homologs and orthologs of the GEP polypeptides can be used in similar methods. Reconstitution of the transcription factor can be detected, for example, by detecting transcription of a gene that is operably linked to a DNA sequence bound by the DNA-binding domain of the reconstituted transcription factor (See, for example, White, 1996, Proc. Natl. Acad. Sci. 93:10001-10003 and references cited therein and Vidal et al., 1996, Proc. Natl. Acad. Sci. 93:10315-10320).

[0015] In an alternative method, an isolated operon containing a nucleic acid molecule encoding a GEP polypeptide is used to identify a compound that decreases the expression of a GEP polypeptide in vivo. Such compounds can be used as antibacterial agents. To discover such compounds, cells that express a GEP polypeptide are cultured, exposed to a test compound (or a mixture of test compounds), and the level of expression or activity is compared with the level of GEP polypeptide expression or activity in cells that are otherwise identical but that have not been exposed to the test compound(s). Many standard quantitative assays of gene expression can be utilized in this aspect of the invention.

[0016] To identify compounds that modulate expression of a GEP polypeptide (or homologous or orthologous sequence), the test compound(s) can be added at varying concentrations to the culture medium of cells that express a GEP polypeptide (or homolog or ortholog), as described herein. Such test compounds can include small molecules (typically, non-protein, non-polysaccharide chemical entities), polypeptides, and nucleic acids. The expression of the GEP polypeptide is then measured, for example, by Northern blot, PCR analysis, or RNAse protection analyses using a nucleic acid molecule of the invention as a probe. The level of expression in the presence of the test molecule, compared with the level of expression in its absence, will indicate whether or not the test molecule alters the expression of the GEP polypeptide. Because the GEP polypeptides are expressed from operons that are essential for survival, test compounds that inhibit the expression and/or function of the GEP polypeptide will inhibit growth of the cells or kill the cells.

[0017] Compounds that modulate the expression of the polypeptides of the invention can be identified by carrying out the assays described herein and then measuring the levels of the GEP polypeptides expressed in the cells, e.g., by performing a Western blot analysis using antibodies that bind to a GEP polypeptide.

[0018] The invention further features methods of identifying from a large group of mutants those strains that have conditional lethal mutations. In general, the gene and corresponding gene product are subsequently identified, although the strains themselves can be used in screening or diagnostic assays. The mechanism(s) of action for the identified genes and gene products provide a rational basis for the design of antibacterial therapeutic agents. These antibacterial agents reduce the action of the gene product in a wild type strain, and therefore are useful in treating a subject with that type, or a similarly susceptible type of infection by administering the agent to the subject in a pharmaceutically effective amount. Reduction in the action of the gene product includes competitive inhibition of the gene product for the active site of an enzyme or receptor; non-competitive inhibition; disrupting an intracellular cascade path which requires the gene product; binding to the gene product itself, before or after post-translational processing; and acting as a gene product mimetic, thereby down-regulating the activity. Therapeutic agents include monoclonal antibodies raised against the gene product.

[0019] Furthermore, the presence of the gene sequence in certain cells (e.g., a pathogenic bacterium of the same genus or similar species), and the absence or divergence of the sequence in host cells can be determined, if desired. Therapeutic agents directed toward genes or gene products that are not present in the host have several advantages, including fewer side effects, and lower overall dosage.

[0020] The invention includes pharmaceutical formulations that include a pharmaceutically acceptable excipient and an antibacterial agent identified using the methods described herein. In particular, the invention includes pharmaceutical formulations that contain antibacterial agents that inhibit the growth of, or kill, pathogenic Streptococcus strains. Such pharmaceutical formulations can be used for treating a Streptococcus infection in an organism. Such a method entails administering to the organism a therapeutically effective amount of the pharmaceutical formulation. In particular, such pharmaceutical formulations can be used to treat streptococcal pneumonia in mammals such as humans and domesticated mammals (e.g., cows, pigs, dogs, and cats), and in plants. The efficacy of such antibacterial agents in humans can be estimated in an animal model system well known to those of skill in the art (e.g., mouse and rabbit model systems).

[0021] Also included within the invention are polyclonal and monoclonal antibodies that specifically bind to the various GEP polypeptides described herein (e.g., gep103). Such antibodies can facilitate detection of GEP polypeptides in various Streptococcus strains. These antibodies also are useful for detecting binding of a test compound to GEP polypeptides (e.g., using the assays described herein). In addition, monoclonal antibodies that bind to GEP polypeptides are themselves adequate antibacterial agents when administered to a mammal, as such monoclonal antibodies are expected to impede one or more functions of GEP polypeptides.

[0022] As used herein, “nucleic acids” encompass both RNA and DNA, including genomic DNA and synthetic (e.g., chemically synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where single-stranded, the nucleic acid may be a sense strand or an antisense strand. The nucleic acid may be synthesized using oligonucleotide analogs or derivatives (e.g., inosine or phosphorothioate nucleotides). Such oligonucleotides can be used, for example, to prepare nucleic acids that have altered base-pairing abilities or increased resistance to nucleases.

[0023] An “isolated nucleic acid” is a DNA or RNA that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5′ end and one on the 3′ end) in the naturally occurring genome of the organism from which it is derived. Thus, in one embodiment, an isolated nucleic acid includes some or all of the 5′ non-coding (e.g., promoter) sequences that are immediately contiguous to the coding sequence. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding an additional polypeptide sequence. The term “isolated” can refer to a nucleic acid or polypeptide that is substantially free of cellular material, viral material, or culture medium (when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an “isolated nucleic acid fragment” is a nucleic acid fragment that is not naturally occurring as a fragment and would not be found in the natural state. As used herein, the term “isolated nucleic acid molecule” includes an operon containing a contiguous cluster of linked sequences. “Isolated operons” are those operons that are not naturally occurring and which are not associated with the sequences by which they are normally surrounded in a bacterial genome.

[0024] A nucleic acid sequence that is “substantially identical” to a GEP nucleotide sequence is at least 80% (e.g., 85%) identical to the nucleotide sequence of the nucleic acid sequences represented by the SEQ ID NOs. listed in Table 1, as depicted in FIGS. 1-23. For purposes of comparison of nucleic acids, the length of the reference nucleic acid sequence will generally be at least 40 nucleotides, e.g., at least 60 nucleotides or more. Sequence identity can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705).

[0025] The GEP polypeptides useful in practicing the invention include, but are not limited to, recombinant polypeptides and natural polypeptides. Also useful in the invention are nucleic acid sequences that encode forms of GEP polypeptides in which naturally occurring amino acid sequences are altered or deleted. Preferred nucleic acids encode polypeptides that are soluble under normal physiological conditions. Also within the invention are nucleic acids encoding fusion proteins in which a portion of a GEP polypeptide is fused to an unrelated polypeptide (e.g., a marker polypeptide or a fusion partner) to create a fusion protein. For example, the polypeptide can be fused to a hexa-histidine tag to facilitate purification of bacterially expressed polypeptides, or to a hemagglutinin tag to facilitate purification of polypeptides expressed in eukaryotic cells. The invention also includes, for example, isolated polypeptides (and the nucleic acids that encode these polypeptides) that include a first portion and a second portion; the first portion includes, e.g., a GEP polypeptide, and the second portion includes an immunoglobulin constant (Fc) region or a detectable marker.

[0026] The fusion partner can be, for example, a polypeptide which facilitates secretion, e.g., a secretory sequence. Such a fused polypeptide is typically referred to as a preprotein. The secretory sequence can be cleaved by the host cell to form the mature protein. Also within the invention are nucleic acids that encode a GEP polypeptide fused to a polypeptide sequence to produce an inactive preprotein. Preproteins can be converted into the active form of the protein by removal of the inactivating sequence.

[0027] The invention also includes nucleic acids that hybridize, e.g., under stringent hybridization conditions (as defined herein) to all or a portion of the nucleotide sequences represented by the SEQ ID NOs. listed in Table 1, or their complements. The hybridizing portion of the hybridizing nucleic acids is typically at least 15 (e.g., 20, 30, or 50) nucleotides in length. The hybridizing portion of the hybridizing nucleic acid is at least 80%, e.g., at least 95%, or at least 98%, identical to the sequence of a portion or all of a nucleic acid encoding a GEP polypeptide or its complement. Hybridizing nucleic acids of the type described herein can be used as a cloning probe, a primer (e.g., a PCR primer), or a diagnostic probe. Nucleic acids that hybridize to the nucleotide sequences represented by the SEQ ID NOs. listed in Table 1 are considered “antisense oligonucleotides.” Also included within the invention are ribozymes that inhibit the function of operons containing the GEP genes of the invention, as determined, for example, in a complementation assay.

[0028] Also useful in the invention are various cells, e.g., transformed host cells, that contain a GEP nucleic acid described herein. A “transformed cell” is a cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, a nucleic acid encoding a GEP polypeptide. Both prokaryotic and eukaryotic cells are included, e.g., bacteria, Streptococcus, Bacillus, and the like.

[0029] Also useful in the invention are genetic constructs (e.g., vectors and plasmids) that include a nucleic acid of the invention which is operably linked to a transcription and/or translation sequence to enable expression, e.g., expression vectors. By “operably linked” is meant that a selected nucleic acid, e.g., a DNA molecule encoding a GEP polypeptide, is positioned adjacent to one or more sequence elements, e.g., a promoter, which directs transcription and/or translation of the sequence such that the sequence elements can control transcription and/or translation of the selected nucleic acid.

[0030] The invention also features purified or isolated polypeptides encoded by nucleic acids located within operons containing GEP genes, as listed in Table 1. As used herein, both “protein” and “polypeptide” mean any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Thus, the terms gep103 polypeptide, gep1119 polypeptide, gep1122 polypeptide, gep1315 polypeptide, gep1493 polypeptide, gep1507 polypeptide, gep1511 polypeptide, gep1518 polypeptide, gep1546 polypeptide, gep1551 polypeptide, gep1561 polypeptide, gep1580 polypeptide, gep1713 polypeptide, gep222 polypeptide, gep2283 polypeptide, gep273 polypeptide, gep286 polypeptide, gep311 polypeptide, gep3262 polypeptide, gep3387 polypeptide, gep47 polypeptide, gep61 polypeptide, and gep76 polypeptide include full-length, naturally occurring gep103, gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76 proteins, respectively, as well as recombinantly or synthetically produced polypeptides that correspond to the full-length, naturally occurring proteins, or to a portion of the naturally occurring or synthetic polypeptide.

[0031] A “purified” or “isolated” compound is a composition that is at least 60% by weight the compound of interest, e.g., a GEP polypeptide or antibody. Preferably the preparation is at least 75% (e.g., at least 90% or 99%) by weight the compound of interest. Purity can be measured by any appropriate standard method, e.g., column chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.

[0032] Preferred GEP polypeptides include a sequence substantially identical to all or a portion of a naturally occurring GEP polypeptide, e.g., including all or a portion of the sequences shown in FIGS. 1-23. Polypeptides “substantially identical” to the GEP polypeptide sequences described herein have an amino acid sequence that is at least 80% (e.g., 85%, 90%, 95%, or 99%) identical to the amino acid sequence of the GEP polypeptides represented by the SEQ ID NOs. listed in Table 1. For purposes of comparison, the length of the reference GEP polypeptide sequence will generally be at least 16 amino acids, e.g., at least 20 or 25 amino acids.

[0033] In the case of polypeptide sequences that are less than 100% identical to a reference sequence, the non-identical positions are preferably, but not necessarily, conservative substitutions for the reference sequence. Conservative substitutions typically include substitutions within the following groups: glycine and alanine; valine, isoleucine, and leucine; aspartic acid and glutamic acid; asparagine and glutamine; serine and threonine; lysine and arginine; and phenylalanine and tyrosine.
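The substitution groups listed in paragraph [0033] can be encoded directly, as in the following Python sketch. The function names and the two six-residue sequences are hypothetical and are not taken from any SEQ ID NO. of this specification; the sketch assumes two sequences of equal length that are already aligned without gaps.

```python
# Substitution groups from paragraph [0033], in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("GA"),   # glycine, alanine
    set("VIL"),  # valine, isoleucine, leucine
    set("DE"),   # aspartic acid, glutamic acid
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("KR"),   # lysine, arginine
    set("FY"),   # phenylalanine, tyrosine
]


def is_conservative(ref: str, var: str) -> bool:
    """True if ref -> var is a substitution within one of the listed groups."""
    ref, var = ref.upper(), var.upper()
    if ref == var:
        return False  # identical residues are not a substitution
    return any(ref in group and var in group for group in CONSERVATIVE_GROUPS)


def classify_positions(reference: str, variant: str) -> dict:
    """Count identical, conservative, and non-conservative positions."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length (ungapped)")
    counts = {"identical": 0, "conservative": 0, "non_conservative": 0}
    for ref, var in zip(reference.upper(), variant.upper()):
        if ref == var:
            counts["identical"] += 1
        elif is_conservative(ref, var):
            counts["conservative"] += 1
        else:
            counts["non_conservative"] += 1
    return counts


# Invented sequences: K->R, V->I, and D->E are conservative; S->W is not.
print(classify_positions("MKVLDS", "MRILEW"))
# {'identical': 2, 'conservative': 3, 'non_conservative': 1}
```

Whether a given conservative variation retains function is determined experimentally (e.g., in a complementation assay); the sketch classifies substitutions only by group membership.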

[0034] Where a particular polypeptide is said to have a specific percent identity to a reference polypeptide of a defined length, the percent identity is relative to the reference polypeptide. Thus, a polypeptide that is 50% identical to a reference polypeptide that is 100 amino acids long can be a 50 amino acid polypeptide that is completely identical to a 50 amino acid long portion of the reference polypeptide. It also might be a 100 amino acid long polypeptide which is 50% identical to the reference polypeptide over its entire length. Of course, other polypeptides also will meet the same criteria.
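The two cases described in paragraph [0034] can be worked through numerically with a minimal Python sketch. It assumes the candidate is laid against the reference without gaps at a known offset, which is a simplification; sequence analysis software of the kind mentioned herein would determine the alignment itself. The function name and the 100-residue reference are invented for illustration, and "X" is used only as a placeholder for a mismatching residue.

```python
def percent_identity_to_reference(reference: str, candidate: str, offset: int = 0) -> float:
    """Percent identity relative to the full length of the reference.

    The candidate is placed, ungapped, at position `offset` of the reference;
    identical positions are counted and divided by len(reference).
    """
    if offset < 0 or offset + len(candidate) > len(reference):
        raise ValueError("candidate must lie within the reference")
    matches = sum(
        1 for i, residue in enumerate(candidate) if reference[offset + i] == residue
    )
    return 100.0 * matches / len(reference)


# Invented 100-residue reference (20 amino acid letters repeated 5 times).
reference = "ACDEFGHIKLMNPQRSTVWY" * 5

# Case 1: a 50-residue polypeptide identical to a 50-residue portion.
fragment = reference[25:75]
print(percent_identity_to_reference(reference, fragment, offset=25))  # 50.0

# Case 2: a 100-residue polypeptide matching at every other position.
full_length = "".join(
    residue if i % 2 == 0 else "X" for i, residue in enumerate(reference)
)
print(percent_identity_to_reference(reference, full_length))  # 50.0
```

Both candidates score 50% because the denominator is the length of the reference in each case, not the length of the candidate.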

[0035] The invention also features purified or isolated antibodies that specifically bind to a GEP polypeptide. By “specifically binds” is meant that an antibody recognizes and binds to a particular antigen, e.g., a GEP polypeptide, but does not substantially recognize and bind to other molecules in a sample, e.g., a biological sample that naturally includes a GEP polypeptide.

[0036] In another aspect, the invention features a method for detecting a GEP polypeptide in a sample. This method includes: obtaining a sample suspected of containing a GEP polypeptide; contacting the sample with an antibody that specifically binds to a GEP polypeptide under conditions that allow the formation of complexes of an antibody and the GEP polypeptide; and detecting the complexes, if any, as an indication of the presence of a GEP polypeptide in the sample.

[0037] Also encompassed by the invention is a method of obtaining a gene related to (i.e., a functional homolog or ortholog of) a GEP gene. Such a method entails obtaining a labeled probe that includes an isolated nucleic acid which encodes all or a portion of a GEP nucleic acid, or a homolog or ortholog thereof; screening a nucleic acid fragment library with the labeled probe under conditions that allow hybridization of the probe to nucleic acid fragments in the library, thereby forming nucleic acid duplexes; isolating labeled duplexes, if any; and preparing a full-length gene sequence from the nucleic acid fragments in any labeled duplex to obtain a gene related to the GEP gene.

[0038] The invention offers several advantages. For example, the methods for identifying antibacterial agents can be configured for high throughput screening of numerous candidate antibacterial agents.

[0039] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described herein. All publications, patent applications, patents, and other references mentioned herein are incorporated herein by reference in their entirety. In the case of a conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative and are not intended to limit the scope of the invention, which is defined by the claims.

[0040] Other features and advantages of the invention will be apparentfrom the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0041] FIG. 1 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep103 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 1, 2, and 3, respectively).

[0042] FIG. 2 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1119 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 4, 5, and 6, respectively).

[0043] FIG. 3 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1122 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 7, 8, and 9, respectively).

[0044] FIG. 4 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1315 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 10, 11, and 12, respectively).

[0045] FIG. 5 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1493 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 13, 14, and 15, respectively).

[0046] FIG. 6 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1507 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 16, 17, and 18, respectively).

[0047] FIG. 7 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1511 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 19, 20, and 21, respectively).

[0048] FIG. 8 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1518 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 22, 23, and 24, respectively).

[0049] FIG. 9 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1546 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 25, 26, and 27, respectively).

[0050] FIG. 10 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1551 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 28, 29, and 30, respectively).

[0051] FIG. 11 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1561 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 31, 32, and 33, respectively).

[0052] FIG. 12 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1580 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 34, 35, and 36, respectively).

[0053] FIG. 13 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep1713 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 37, 38, and 39, respectively).

[0054] FIG. 14 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep222 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 40, 41, and 42, respectively).

[0055] FIG. 15 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep2283 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 43, 44, and 45, respectively).

[0056] FIG. 16 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep273 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 46, 47, and 48, respectively).

[0057] FIG. 17 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep286 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 49, 50, and 51, respectively).

[0058] FIG. 18 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep311 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 52, 53, and 54, respectively).

[0059] FIG. 19 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep3262 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 55, 56, and 57, respectively).

[0060] FIG. 20 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep3387 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 58, 59, and 60, respectively).

[0061] FIG. 21 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep47 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 61, 62, and 63, respectively).

[0062] FIG. 22 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep61 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 64, 65, and 66, respectively).

[0063] FIG. 23 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the gep76 polypeptide and gene from a Streptococcus pneumoniae strain (SEQ ID NOs: 67, 68, and 69, respectively).

[0064] FIG. 24 is a representation of the amino acid and coding strand and non-coding strand nucleic acid sequences of the B-yneS polypeptide and gene from a Bacillus subtilis strain (SEQ ID NOs: 70, 71, and 72, respectively).

[0065] FIG. 25 is a schematic representation of the PCR strategy used to produce DNA molecules used for targeted deletions of essential genes in Streptococcus pneumoniae.

[0066] FIG. 26 is a schematic representation of the strategy used to produce targeted deletions of essential genes in Streptococcus pneumoniae.

DETAILED DESCRIPTION OF THE INVENTION

[0067] Identifying Streptococcus Genes in Essential Operons

[0068] As shown by the experiments described below, each of the GEP genes is located within an operon that is essential for survival of Streptococcus pneumoniae. Streptococcus pneumoniae is available from the ATCC. To identify genes located within essential operons, mutants of Streptococcus pneumoniae were produced. In general, mutagenesis of Streptococcus pneumoniae can be accomplished using any of various art-known methods.

[0069] In general, and for the examples set forth below, genes located within essential Streptococcus pneumoniae operons can be identified using genes from a Streptococcus pneumoniae RX1 genomic library, which was produced using standard methods (see Kim et al., Nucl. Acids. Res. 20:1083-1085 (1992) and Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, NY)). Genes in this Streptococcus library were disrupted using a shuttle mutagenesis approach with the transposon TnPho-A. Each disrupted gene then was tested to determine whether it was located within an operon that is essential for survival of Streptococcus pneumoniae. In this method, 2 ml of LB broth supplemented with chloramphenicol (10 μg/ml), MgSO₄ (10 mM) and maltose (0.2%) were inoculated with 50 μl of the Streptococcus pneumoniae RX1 plasmid library. The culture was grown at 37° C. while shaking until the OD₆₅₀ of the culture reached 0.8 (approximately 2 hours). A 1 ml aliquot of TnPho-A-containing phage (10⁹ pfu/ml) was added to 1 ml of the Streptococcus culture, producing a ratio of approximately 10 phage to 1 cell. The phage and cells were incubated at 37° C. for 30 minutes. A 4 ml aliquot of LB broth, warmed to 37° C., then was added to the phage/cell mixture, and the mixture was incubated at 37° C., while shaking, for 1 hour. The cells then were pelleted by centrifuging them at 3500 rpm in a Beckman tabletop centrifuge for 5 minutes.
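
The phage-to-cell ratio of approximately 10 to 1 stated above can be checked with a short calculation. The following Python sketch is illustrative only: the cell density of 10⁸ cells/ml at an OD₆₅₀ of 0.8 is an assumption (no cell count is given herein), and the function name is hypothetical.

```python
def phage_per_cell(phage_titer_pfu_per_ml, phage_volume_ml,
                   cell_density_per_ml, cell_volume_ml):
    """Return the number of phage particles added per bacterial cell."""
    total_phage = phage_titer_pfu_per_ml * phage_volume_ml
    total_cells = cell_density_per_ml * cell_volume_ml
    return total_phage / total_cells


# ASSUMPTION: about 1e8 cells/ml at OD650 = 0.8; this value is not stated
# in the specification and is used only to illustrate the arithmetic.
ASSUMED_CELLS_PER_ML = 1e8

# 1 ml of phage at 1e9 pfu/ml mixed with 1 ml of culture
ratio = phage_per_cell(1e9, 1.0, ASSUMED_CELLS_PER_ML, 1.0)
print(f"approximately {ratio:.0f} phage per cell")
```

Under that assumed density the calculation reproduces the stated ratio; a different cell density would scale the ratio proportionally.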

[0070] The pelleted cells then were resuspended in 800 μl of LB broth, and a 200 μl aliquot of cells was plated onto each of four petri plates containing LB agar supplemented with chloramphenicol (10 μg/ml), kanamycin (50 μg/ml), and erythromycin (300 μg/ml). The plates then were incubated overnight at 37° C., and the number of colonies appearing on the plates was counted. Approximately 18,000 colonies then were pooled and used to inoculate 50 ml of LB broth, which was incubated overnight at 37° C. Plasmid DNA from the culture then was extracted using a Qiagen MIDI Prep Kit; other art-known extraction methods can be substituted.

[0071] The concentration of the extracted DNA was measured, and 100 ng of the DNA was transformed, by electroporation, into E. coli DH10B cells (Gibco BRL). A 1 ml aliquot of SOC broth then was added to the transformed cells, and the cells were incubated at 37° C. for 1 hour before being pelleted by centrifugation at 3500 rpm for 5 minutes. The cells then were resuspended in 200 μl of LB broth, and aliquots of 2, 20, and 50 μl were plated onto petri plates containing LB agar and antibiotics as described above. After incubating the plates overnight at 37° C., 93 colonies were picked and used, individually, to inoculate 1.25 ml of Terrific broth supplemented with chloramphenicol (10 μg/ml), kanamycin (50 μg/ml), and erythromycin (300 μg/ml). The cultures were incubated at 37° C. for approximately 20 hours, while shaking. The DNA from each culture then was extracted, using a conventional alkaline lysis miniprep method.

[0072] The extracted DNA samples then were used, individually, to transform Streptococcus pneumoniae cells in a 96-well microtitre format. The transposon promotes insertion of the mutagenized gene into the bacterial chromosome. Non-transforming clones indicate that the mutation was within an operon containing an essential gene.

[0073] The non-transforming clones then were grown in 50 ml of Terrific broth supplemented with chloramphenicol (10 μg/ml), kanamycin (50 μg/ml), and erythromycin (300 μg/ml). DNA from these clones was extracted and retransformed into Streptococcus pneumoniae and plated on petri dishes to confirm that they were non-transforming. The genes located within essential operons then were sequenced, using primers that hybridize to sequences of the transposon. The sequences of the primers were:

[0074] 5′GCAGCCCGGTTTTCCAGAACAGG3′ (SEQ ID NO: 73) and

[0075] 5′GATTTAGCCCAGTCGGCCGCACG3′ (SEQ ID NO: 74).

[0076] In an alternative method, which also was used, the transposon Tn10 was used to disrupt genes in a Streptococcus pneumoniae fosmid library, which was produced using standard methods. A 50 ml aliquot of TBMM broth supplemented with chloramphenicol (10 μg/ml), MgSO₄ (10 mM), and maltose (0.2%) was inoculated with a single fosmid colony from the fosmid library, and the culture was grown overnight at 37° C. The cells then were pelleted and resuspended in 5 ml of LB broth supplemented with chloramphenicol (10 μg/ml), MgSO₄ (10 mM), and maltose (0.2%). A 100 μl aliquot of the cells then was mixed with 100 μl of Tn10 phage lysate (10¹⁰ pfu/ml), and the mixture was incubated at room temperature for 15 minutes and then incubated at 37° C. for 15 minutes.

[0077] A 5 ml aliquot of LB broth supplemented with IPTG (1 mM) and sodium citrate (50 mM) and warmed to 37° C. then was added to the cell/phage mixture. After incubating the cell/phage mixture at 37° C., while shaking, the cells were pelleted and resuspended in 800 μl of LB broth. The cells then were plated onto 4 plates of LB agar supplemented with chloramphenicol (10 μg/ml) and erythromycin (300 μg/ml). After incubating the cells overnight at 37° C., at least 10,000 of the resulting colonies were used to inoculate 50 ml of LB broth. DNA then was extracted and quantified using standard methods, and 100 ng of DNA were used to transform E. coli DH10B cells (Gibco BRL) via electroporation. After adding 1 ml of SOC broth to the cells, the cells were incubated at 37° C. for 1 hour. The cells then were pelleted and suspended in 200 μl LB broth, and aliquots of 2, 20, and 50 μl were plated onto LB agar supplemented with chloramphenicol (10 μg/ml), kanamycin (50 μg/ml), and erythromycin (300 μg/ml). The plates then were incubated overnight at 37° C., and 93 colonies were picked and used to inoculate 1.25 ml of Terrific broth supplemented with chloramphenicol (10 μg/ml), kanamycin (50 μg/ml) and erythromycin (300 μg/ml). These cultures were incubated for approximately 20 hours, while shaking, and the DNA was isolated using a standard miniprep method. The extracted DNA then was used to transform Streptococcus pneumoniae, and the genes located within essential operons were sequenced as described above. The sequences of the primers used for sequencing were:

[0078] 5′CCGCCATTCTTTGCTGTTTCG3′ (SEQ ID NO: 75) and

[0079] 5′TTACACGTTACTAAAGGGAATG3′ (SEQ ID NO: 76).

[0080] Identification of the gep1493, gep1507, gep1546, gep273, gep286, and gep76 Genes as Essential Genes

[0081] As shown by the experiments described below, the gep1493, gep1507, gep1546, gep273, gep286, and gep76 genes each have been shown to be essential for survival of Streptococcus pneumoniae. Each of the gep1493, gep1507, gep1546, gep273, gep286, and gep76 genes has been identified as essential by creating a targeted deletion of each gene, separately, in Streptococcus pneumoniae.

[0082] Each of the gep1493, gep1507, gep1546, gep273, gep286, and gep76 genes was, separately, replaced with a nucleic acid sequence conferring resistance to the antibiotic erythromycin (an “erm” gene). Other genetic markers can be used in lieu of this particular antibiotic resistance marker. Polymerase chain reaction (PCR) amplification was used to make a targeted deletion in the Streptococcus genomic DNA, as shown in FIG. 25. Several PCR reactions were used to produce the DNA molecules needed to carry out targeted deletion of the genes of interest. First, using primers 5 and 6, an erm gene was amplified from pIL252 from B. subtilis (available from the Bacillus Genetic Stock Center, Columbus, Ohio). Primer 5 consists of 21 nucleotides that are identical to the promoter region of the erm gene and complementary to Sequence A. Primer 5 has the sequence 5′GTG TTC GTG CTG ACT TGC ACC3′ (SEQ ID NO: 77). Primer 6 consists of 21 nucleotides that are complementary to the 3′ end of the erm gene. Primer 6 has the sequence 5′GAA TTA TTT CCT CCC GTT AAA3′ (SEQ ID NO: 78). PCR amplification of the erm gene was carried out under the following conditions: 30 cycles of 94° C. for 1 minute, 55° C. for 1 minute, and 72° C. for 1.5 minutes, followed by one cycle of 72° C. for 10 minutes.

[0083] In the second and third PCR reactions, sequences flanking the gene of interest were amplified and produced as hybrid DNA molecules that also contained a portion of the erm gene. The second reaction produced a double-stranded DNA molecule (termed “Left Flanking Molecule”) that includes sequences upstream of the 5′ end of the gene of interest and the first 21 nucleotides of the erm gene. As shown in FIG. 25, this reaction utilized primer 1, which is 21 nucleotides in length and identical to a sequence that is located approximately 500 bp upstream of the translation start site of the gene of interest. Primers 1 and 2 are gene-specific and include the sequences 5′CTC CGT GAA GTC CAC CTG AT3′ (SEQ ID NO: 79) and 5′GGT GCA AGT CAG CAC GAA CAC GCG ACA TAG GTT CCA GTT AGG3′ (SEQ ID NO: 80), respectively, for gep1493. Primer 2 is 42 nucleotides in length, with 21 of the nucleotides at the 3′ end of the primer being complementary to the 5′ end of the sense strand of the gene of interest. The 21 nucleotides at the 5′ end of the primer are identical to Sequence A and are therefore complementary to the 5′ end of the erm gene. Thus, PCR amplification using primers 1 and 2 produced the left flanking DNA molecule, which is a hybrid DNA molecule containing a sequence located upstream of the gene of interest and 21 base pairs of the erm gene, as shown in FIG. 25.

[0084] The third PCR reaction was similar to the second reaction, but produced the right flanking DNA molecule, shown in FIG. 25. The right flanking DNA molecule contains 21 base pairs of the 3′ end of the erm gene, a 21 base pair portion of the 3′ end of the gene of interest, and sequences downstream of the gene of interest. This right flanking DNA molecule was produced with gene-specific primers 3 and 4. For gep1493, primers 3 and 4 included the sequences 5′TTT AAC GGG AGG AAA TAA TTC CCA TAT CGT GGC TCC TGA AT3′ (SEQ ID NO: 81) and 5′TAA AGC CCT CAT GTC GAA CC3′ (SEQ ID NO: 82), respectively. Primer 3 is 42 nucleotides; the 21 nucleotides at the 5′ end of Primer 3 are identical to Sequence B and therefore are identical to the 3′ end of the erm gene. The 21 nucleotides at the 3′ end of Primer 3 are identical to the 3′ end of the gene of interest. Primer 4 is 21 nucleotides in length and is complementary to a sequence located approximately 500 bp downstream of the gene of interest. As discussed above, primers 1-4 are gene-specific, and the sequences disclosed above were used for gep1493. Gene-specific primers were used to identify the other essential genes described herein, as shown in Table 2.

TABLE 2
Primers Used in Identifying Essential Genes

Gene     Primer    Sequence                                        SEQ ID NO
gep1493  Primer 1  5′CTCCGTGAAGTCCACCTGAT3′                        79
gep1493  Primer 2  5′GGTGCAAGTCAGCACGAACACTGCTCGCGTAGATTGATTTG3′   80
gep1493  Primer 3  5′TTTAACGGGAGGAAATAATTCGGGGATTGAACCTAACCCAT3′   81
gep1493  Primer 4  5′TTGGCAAGAAGGCAGAGAAT3′                        82
gep1507  Primer 1  5′GCATGAGAAACCCAGTCTCC3′                        83
gep1507  Primer 2  5′GGTGCAAGTCAGCACGAACACGCGACATAGGTTCCAGTTAGG3′  84
gep1507  Primer 3  5′TTTAACGGGAGGAAATAATTCCCATATCGTGGCTCCTGAAT3′   85
gep1507  Primer 4  5′TAAAGCCCTCATGTCGAACC3′                        86
gep1546  Primer 1  5′CAGTGACGATACAGATGAAGAA3′                      87
gep1546  Primer 2  5′GGTGCAAGTCAGCACGAACACGATGCTGGCTTCGTTGAGTG3′   88
gep1546  Primer 3  5′TTTAACGGGAGGAAATAATTCGTCGCGACTCCTAGCCATAC3′   89
gep1546  Primer 4  5′CCAGCAAAGGAAAACCGATA3′                        90
gep273   Primer 1  5′GGTCAGTGACAGCAGCAGAT3′                        91
gep273   Primer 2  5′GGTGCAAGTCAGCACGAACACGGCCTTGGAAAAAAGACCAT3′   92
gep273   Primer 3  5′TTTAACGGGAGGAAATAATTCCCGCTTAAATTCTGCCAATC3′   93
gep273   Primer 4  5′CCCATAACCGTATCACCTGG3′                        94
gep286   Primer 1  5′CGGAACGGCTATGAAAAAAA3′                        95
gep286   Primer 2  5′GGTGCAAGTCAGCACGAACACACGACGAAAGGCAACCATAC3′   96
gep286   Primer 3  5′TTTAACGGGAGGAAATAATTCTGGTATGGGGGTTGATGAAG3′   97
gep286   Primer 4  5′TCGCCCTACTTTTCGTATGC3′                        98
gep76    Primer 1  5′AGCGATATTAGTGCGGGAGA3′                        99
gep76    Primer 2  5′GGTGCAAGTCAGCACGAACACCAGCAATTTTGTCATCAGTCG3′  100
gep76    Primer 3  5′TTTAACGGGAGGAAATAATTCCTGGGGTAATGGAGCACAGT3′   101
gep76    Primer 4  5′GGGATTGTCACGGTAAAACC3′                        102
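
The overlap design described above, in which the 5′ tails of Primers 2 and 3 pair with the ends of the erm amplification product, can be checked computationally. The following Python sketch is offered only as an illustration: it tests whether the first 21 nucleotides of Primer 2 and Primer 3 (as quoted in paragraphs [0083] and [0084]) are the reverse complements of Primer 5 and Primer 6, respectively. The function names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_erm_tail(hybrid_primer, erm_primer, tail_len=21):
    """True if the 5' tail of hybrid_primer is the reverse complement of erm_primer."""
    return hybrid_primer[:tail_len] == reverse_complement(erm_primer)


PRIMER_5 = "GTGTTCGTGCTGACTTGCACC"  # SEQ ID NO: 77
PRIMER_6 = "GAATTATTTCCTCCCGTTAAA"  # SEQ ID NO: 78
PRIMER_2 = "GGTGCAAGTCAGCACGAACACGCGACATAGGTTCCAGTTAGG"  # SEQ ID NO: 80, per [0083]
PRIMER_3 = "TTTAACGGGAGGAAATAATTCCCATATCGTGGCTCCTGAAT"   # SEQ ID NO: 81, per [0084]

print(has_erm_tail(PRIMER_2, PRIMER_5))  # left flank pairs with the erm 5' end
print(has_erm_tail(PRIMER_3, PRIMER_6))  # right flank pairs with the erm 3' end
```

Both checks succeed for the quoted sequences, which is consistent with the flanking molecules and the erm product sharing 21 base pair overlaps that allow them to be joined in the fusion reaction.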

[0085] PCR amplification of the left and right flanking DNA molecules was carried out, separately, in 50 μl reaction mixtures containing: 1 μl Streptococcus pneumoniae (RX1) DNA (0.25 μg), 2.5 μl Primer 1 or Primer 4 (10 pmol/μl), 2.5 μl Primer 2 or Primer 3 (20 pmol/μl), 1.2 μl of a dNTP mixture (10 mM each), 37 μl H₂O, 0.7 μl Taq polymerase (5 U/μl), and 5 μl 10×Taq polymerase buffer (10 mM Tris, 50 mM KCl, 2.5 mM MgCl₂). The left and right flanking DNA molecules were amplified using the following PCR cycling program: 95° C. for 2 minutes; 72° C. for 1 minute; 94° C. for 30 seconds; 49° C. for 30 seconds; 72° C. for 1 minute; repeating the 94° C., 49° C., and 72° C. incubations 30 times; 72° C. for 10 minutes and then stopping the reactions. A 15 μl aliquot of each reaction mixture then was electrophoresed through a 1.2% low melting point agarose gel in TAE buffer and then stained with ethidium bromide. Fragments containing the amplified left and right flanking DNA molecules were excised from the gel and purified using the QIAQUICK™ gel extraction kit (Qiagen, Inc.). Other art-known methods for amplifying and isolating DNA can be substituted. The left and right flanking DNA fragments were eluted into 30 μl TE buffer at pH 8.0.
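
As a bookkeeping aid, the reaction mixture described above can be tabulated and the resulting amounts computed. The following Python sketch is illustrative only; the dictionary keys and the helper function are hypothetical names, and final concentrations are computed against the nominal 50 μl volume.

```python
NOMINAL_VOLUME_UL = 50.0  # nominal reaction volume stated in the text

# Component volumes (in microliters) for one flanking-molecule reaction
MIX_UL = {
    "genomic DNA": 1.0,
    "Primer 1 or 4": 2.5,
    "Primer 2 or 3": 2.5,
    "dNTP mixture": 1.2,
    "water": 37.0,
    "Taq polymerase": 0.7,
    "10x buffer": 5.0,
}


def final_concentration(stock, volume_ul, total_ul=NOMINAL_VOLUME_UL):
    """Dilute a stock concentration into the reaction (units follow the stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(MIX_UL.values())
dntp_mm_each = final_concentration(10.0, MIX_UL["dNTP mixture"])  # mM of each dNTP
outer_primer_pmol = 10.0 * MIX_UL["Primer 1 or 4"]   # stock at 10 pmol/ul
hybrid_primer_pmol = 20.0 * MIX_UL["Primer 2 or 3"]  # stock at 20 pmol/ul
taq_units = 5.0 * MIX_UL["Taq polymerase"]           # stock at 5 U/ul

print(f"listed volumes sum to {total_ul:.1f} ul")
print(f"dNTPs: {dntp_mm_each:.2f} mM each; Taq: {taq_units:.1f} U")
print(f"primers: {outer_primer_pmol:.0f} pmol and {hybrid_primer_pmol:.0f} pmol")
```

The listed volumes sum to 49.9 μl, in agreement with the nominal 50 μl reaction volume.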

[0086] The amplified erm gene and left and right flanking DNA molecules were then fused together to produce the fusion product, as shown in FIG. 25. The fusion PCR reaction was carried out in a volume of 50 μl containing: 2 μl of each of the left and right flanking DNA molecules and the erm gene PCR product; 5 μl of 10× buffer; 2.5 μl of Primer 1 (10 pmol/μl); 2.5 μl of Primer 4 (10 pmol/μl); 1.2 μl dNTP mix (10 mM each); 32 μl H₂O; and 0.7 μl Taq polymerase. The PCR reaction was carried out using the following cycling program: 95° C. for 2 minutes; 72° C. for 1 minute; 94° C. for 30 seconds; 48° C. for 30 seconds; 72° C. for 3 minutes; repeat the 94° C., 48° C., and 72° C. incubations 25 times; 72° C. for 10 minutes. After the reaction was stopped, a 12 μl aliquot of the reaction mixture was electrophoresed through an agarose gel to confirm the presence of a final product of approximately 2 kb.

[0087] A 5 μl aliquot of the fusion product was used to transform S. pneumoniae grown on a medium containing erythromycin in accordance with standard techniques. As shown in FIG. 26, the fusion product and the S. pneumoniae genome undergo a homologous recombination event so that the erm gene replaces the chromosomal copy of the gene of interest, thereby creating a gene knockout. Disruption of an essential gene results in no growth on a medium containing erythromycin. Using this gene knockout method, the gep1493, gep1507, gep1546, gep273, gep286, and gep76 genes were each identified as being essential for survival.
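
The fusion cycling program described above can be written down as data, which makes it straightforward to total the programmed incubation time. The following Python sketch is illustrative only; the class and function names are hypothetical, the three-step incubation is assumed to run 25 times in total, and time spent ramping between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # incubation temperature in degrees Celsius
    seconds: int   # programmed hold time


def programmed_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times; ramping between steps is not counted."""
    def hold(steps):
        return sum(step.seconds for step in steps)
    return hold(initial) + n_cycles * hold(cycle) + hold(final)


# Fusion PCR program as described in the text
FUSION_INITIAL = [Step(95, 120), Step(72, 60)]
FUSION_CYCLE = [Step(94, 30), Step(48, 30), Step(72, 180)]
FUSION_FINAL = [Step(72, 600)]
FUSION_CYCLES = 25  # ASSUMPTION: 25 passes in total through the three-step cycle

total_s = programmed_seconds(FUSION_INITIAL, FUSION_CYCLE, FUSION_CYCLES, FUSION_FINAL)
print(f"{total_s / 60:.0f} minutes of programmed incubation")
```

The 3 minute extension at 72° C. is longer than the 1 minute extension used for the flanking molecules, which is in keeping with the larger fusion product of approximately 2 kb.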

[0088] Identification of Homologs and Orthologs of GEP Polypeptides

[0089] Having shown that the various GEP genes are essential or located within operons that are essential for survival of Streptococcus, it can be expected that homologs and orthologs of the polypeptides encoded by these genes, when present in other organisms, for example B. subtilis, are essential or located within operons that are essential for survival of that organism as well, and therefore are useful targets for identifying antibacterial agents. Using the sequences of the GEP polypeptides identified in Streptococcus, homologs and orthologs of these polypeptides can be identified in other organisms. For example, the coding sequences of the GEP nucleic acids can be used to search the GenBank database of nucleotide sequences to identify homologs or orthologs that are expressed from essential operons in other organisms. Sequence comparisons can be performed using the Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol., 215:403-410, 1990). The percent sequence identity shared by the GEP polypeptides and their homologs or orthologs can be determined using the GAP program from the Genetics Computer Group (GCG) Wisconsin Sequence Analysis Package (Wisconsin Package Version 9.0, GCG; Madison, Wis.). The following parameters are suitable: gap creation penalty, 12 (protein) 50 (DNA); gap extension penalty, 4 (protein) 3 (DNA). Typically, the GEP polypeptides and their homologs share at least 25% (e.g., at least 40%) sequence identity. Typically, the DNA sequences encoding GEP polypeptides and their homologs share at least 35% (e.g., at least 45%) sequence identity. To confirm that the homologs or orthologs of the GEP polypeptides are expressed from operons that are essential for survival of bacteria, the operon encoding each of the homologs or orthologs can be, separately, deleted from the genome of the host organism.

[0090] Identification of Essential Operons in Additional Streptococcus Strains

[0091] Now that the various GEP genes have been identified as being located within operons that are essential for survival, these genes, or fragments thereof, can be used to detect homologous or orthologous genes in other organisms. In particular, these genes can be used to analyze various pathogenic and non-pathogenic strains of bacteria. Fragments of a nucleic acid (DNA or RNA) encoding a GEP polypeptide or homolog or ortholog (or sequences complementary thereto) can be used as probes in conventional nucleic acid hybridization assays of pathogenic bacteria. For example, nucleic acid probes (which typically are 8-30, or usually 15-20, nucleotides in length) can be used to detect GEP genes or homologs or orthologs thereof in art-known molecular biology methods, such as Southern blotting, Northern blotting, dot or slot blotting, PCR amplification methods, colony hybridization methods, and the like. Typically, an oligonucleotide probe based on the nucleic acid sequences described herein, or fragments thereof, is labeled and used to screen a genomic library constructed from mRNA obtained from a Streptococcus or bacterial strain of interest. A suitable method of labeling involves using polynucleotide kinase to add ³²P-labeled ATP to the oligonucleotide used as the probe. This method is well known in the art, as are several other suitable methods (e.g., biotinylation and enzyme labeling).
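
For illustration, percent sequence identity can be computed from an existing pairwise alignment as follows. This Python sketch is not the GAP program and does not perform alignment; it assumes the two sequences have already been aligned (e.g., by GAP), counts identities over ungapped columns (one common convention, which may differ from the figure a given program reports), and uses two made-up peptide strings rather than any GEP sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from the sequence listing
TOY_A = "MKTAYIAK-QR"
TOY_B = "MKSAYLAKGQR"

identity = percent_identity(TOY_A, TOY_B)
meets_protein_threshold = identity >= 25.0  # the "at least 25%" level noted above
print(f"{identity:.1f}% identity; meets threshold: {meets_protein_threshold}")
```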

[0092] Hybridization of the oligonucleotide probe to the library, or other nucleic acid sample, typically is performed under stringent to highly stringent conditions. Nucleic acid duplex or hybrid stability is expressed as the melting temperature or Tₘ, which is the temperature at which a probe dissociates from a target DNA. This melting temperature is used to define the required stringency conditions. If sequences are to be identified that are related and substantially identical to the probe, rather than identical, then it is useful to first establish the lowest temperature at which only homologous hybridization occurs with a particular concentration of salt (e.g., SSC or SSPE). Then, assuming that 1% mismatching results in a 1° C. decrease in the Tₘ, the temperature of the final wash in the hybridization reaction is reduced accordingly (for example, if sequences having ≥95% identity with the probe are sought, the final wash temperature is decreased by 5° C.). In practice, the change in Tₘ can be between 0.5° and 1.5° C. per 1% mismatch.

[0093] As used herein, highly stringent conditions refer to hybridization at 68° C. in 5×SSC/5×Denhardt's solution/1.0% SDS, and washing in 0.2×SSC/0.1% SDS at 42° C. Stringent conditions refer to washing in 3×SSC at 42° C. The parameters of salt concentration and temperature can be varied to achieve the optimal level of identity between the probe and the target nucleic acid. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, N.Y.) at Unit 2.10.
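
The wash temperature adjustment described above is simple arithmetic and can be sketched as follows. This Python example is illustrative only; the function name is hypothetical, and the 65° C. starting temperature is an arbitrary placeholder for the empirically established temperature at which only homologous hybridization occurs.

```python
def final_wash_temperature(homologous_temp_c, min_identity_percent,
                           degrees_per_percent_mismatch=1.0):
    """Lower the final wash temperature to tolerate the allowed mismatch.

    homologous_temp_c: lowest temperature at which only homologous
        hybridization occurs at the chosen salt concentration.
    min_identity_percent: minimum identity sought between probe and target.
    degrees_per_percent_mismatch: assumed drop in melting temperature per
        1% mismatch (1.0 by default; 0.5 to 1.5 in practice).
    """
    allowed_mismatch = 100.0 - min_identity_percent
    return homologous_temp_c - allowed_mismatch * degrees_per_percent_mismatch


HYPOTHETICAL_START_C = 65.0  # placeholder value, not taken from the text

nominal = final_wash_temperature(HYPOTHETICAL_START_C, 95.0)
mildest = final_wash_temperature(HYPOTHETICAL_START_C, 95.0, 0.5)
strictest = final_wash_temperature(HYPOTHETICAL_START_C, 95.0, 1.5)
print(nominal, mildest, strictest)
```

For sequences having at least 95% identity, the nominal rule lowers the wash temperature by 5° C., while the 0.5° to 1.5° C. range corresponds to a reduction of between 2.5° C. and 7.5° C.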

[0094] In one approach, libraries constructed from pathogenic or non-pathogenic Streptococcus or bacterial strains can be screened. For example, such strains can be screened for expression of GEP genes by Northern blot analysis. Upon detection of transcripts of the GEP genes or homologs or orthologs thereof, libraries can be constructed from RNA isolated from the appropriate strain, utilizing standard techniques well known to those of skill in the art. Alternatively, a total genomic DNA library can be screened using a GEP gene probe (or a probe directed to a homolog or ortholog thereof).

[0095] New gene sequences can be isolated, for example, by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of nucleotide sequences within the GEP genes, or their homologs or orthologs, as depicted herein. The template for the reaction can be DNA obtained from strains known or suspected to express a GEP allele or an allele of a homolog or ortholog thereof. The PCR product can be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a new GEP nucleic acid sequence, or a sequence of a homolog or ortholog thereof.

[0096] Synthesis of the various GEP polypeptides or their homologs or orthologs (or an antigenic fragment thereof) for use as antigens, or for other purposes, can readily be accomplished using any of the various art-known techniques. For example, a polypeptide or homolog or ortholog thereof, or an antigenic fragment(s), can be synthesized chemically in vitro, or enzymatically (e.g., by in vitro transcription and translation). Alternatively, the gene can be expressed in, and the polypeptide purified from, a cell (e.g., a cultured cell) by using any of the numerous, available gene expression systems. For example, the polypeptide antigen can be produced in a prokaryotic host (e.g., E. coli or B. subtilis) or in eukaryotic cells, such as yeast cells or insect cells (e.g., by using a baculovirus-based expression vector).
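
A degenerate oligonucleotide primer pool of the kind mentioned above can be enumerated from a sequence written with standard IUPAC ambiguity codes. The following Python sketch is illustrative only; the function name is hypothetical, and the example oligonucleotide is made up for demonstration and is not derived from any GEP gene.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every unambiguous sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical degenerate primer: N stands for any base, R for A or G
pool = expand_degenerate("ATGGCNAAR")
print(len(pool), "sequences in the pool")
for member in pool:
    print(member)
```

The size of the pool is the product of the number of bases allowed at each position, so each additional ambiguous position multiplies the number of distinct primers in the pool.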

[0097] Proteins and polypeptides can also be produced in plant cells, if desired. For plant cells, viral expression vectors (e.g., cauliflower mosaic virus and tobacco mosaic virus) and plasmid expression vectors (e.g., Ti plasmid) are suitable. Such cells are available from a wide range of sources (e.g., the American Type Culture Collection, Rockville, Md.; also, see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1994). The optimal methods of transformation or transfection and the choice of expression vehicle will depend on the host system selected. Transformation and transfection methods are described, e.g., in Ausubel et al., supra; expression vehicles may be chosen from those provided, e.g., in Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al., 1985, Supp. 1987). The host cells harboring the expression vehicle can be cultured in conventional nutrient media, adapted as needed for activation of a chosen gene, repression of a chosen gene, selection of transformants, or amplification of a chosen gene.

[0098] If desired, GEP polypeptides or their homologs or orthologs can be produced as fusion proteins. For example, the expression vector pUR278 (Ruther et al., EMBO J., 2:1791, 1983) can be used to create lacZ fusion proteins. The art-known pGEX vectors can be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can be easily purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene product can be released from the GST moiety.

[0099] In an exemplary insect cell expression system, a baculovirus such as Autographa californica nuclear polyhedrosis virus (AcNPV), which grows in Spodoptera frugiperda cells, can be used as a vector to express foreign genes. A coding sequence encoding a GEP polypeptide or homolog or ortholog can be cloned into a non-essential region (for example the polyhedrin gene) of the viral genome and placed under control of a promoter, e.g., the polyhedrin promoter or an exogenous promoter. Successful insertion of a gene encoding a GEP polypeptide or homolog or ortholog can result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat encoded by the polyhedrin gene). These recombinant viruses are then used to infect insect cells (e.g., Spodoptera frugiperda cells) in which the inserted gene is expressed (see, e.g., Smith et al., J. Virol., 46:584, 1983; Smith, U.S. Pat. No. 4,215,051).

[0100] In mammalian host cells, a number of viral-based expressionsystems can be utilized. When an adenovirus is used as an expressionvector, the nucleic acid sequence encoding the GEP polypeptide orhomolog or ortholog can be ligated to an adenovirustranscription/translation control complex, e.g., the late promoter andtripartite leader sequence. This chimeric gene can then be inserted intothe adenovirus genome by in vitro or in vivo recombination. Insertioninto a non-essential region of the viral genome (e.g., region E1 or E3)will result in a recombinant virus that is viable and capable ofexpressing a essential gene product in infected hosts (see, e.g., Logan,Proc. Natl. Acad. Sci. USA, 81:3655, 1984).

[0101] Specific initiation signals may be required for efficienttranslation of inserted nucleic acid sequences. These signals includethe ATG initiation codon and adjacent sequences. In general, exogenoustranslational control signals, including, perhaps, the ATG initiationcodon, should be provided. Furthermore, the initiation codon must be inphase with the reading frame of the desired coding sequence to ensuretranslation of the entire sequence. These exogenous translationalcontrol signals and initiation codons can be of a variety of origins,both natural and synthetic. The efficiency of expression may be enhancedby the inclusion of appropriate transcription enhancer elements, ortranscription terminators (Bittner et al., Methods in Enzymol., 153:516,1987).

[0102] The GEP polypeptides and homologs and orthologs can be expressed individually or as fusions with a heterologous polypeptide, such as a signal sequence or other polypeptide having a specific cleavage site at the N- and/or C-terminus of the protein or polypeptide. The heterologous signal sequence selected should be one that is recognized and processed, i.e., cleaved by a signal peptidase, by the host cell in which the fusion protein is expressed.

[0103] A host cell can be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in a specific, desired fashion. Such modifications and processing (e.g., cleavage) of protein products may facilitate optimal functioning of the protein. Various host cells have characteristic and specific mechanisms for post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems familiar to those of skill in the art of molecular biology can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript and phosphorylation of the gene product can be used. Such mammalian host cells include, but are not limited to, CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines.

[0104] If desired, the GEP polypeptide or homolog or ortholog thereof can be produced by a stably-transfected mammalian cell line. A number of vectors suitable for stable transfection of mammalian cells are available to the public, see, e.g., Pouwels et al. (supra); methods for constructing such cell lines are also publicly known, e.g., in Ausubel et al. (supra). In one example, DNA encoding the protein is cloned into an expression vector that includes the dihydrofolate reductase (DHFR) gene. Integration of the plasmid and, therefore, the GEP polypeptide-encoding gene into the host cell chromosome is selected for by including 0.01-300 μM methotrexate in the cell culture medium (as described in Ausubel et al., supra). This dominant selection can be accomplished in most cell types.

[0105] Recombinant protein expression can be increased by DHFR-mediated amplification of the transfected gene. Methods for selecting cell lines bearing gene amplifications are described in Ausubel et al. (supra); such methods generally involve extended culture in medium containing gradually increasing levels of methotrexate. DHFR-containing expression vectors commonly used for this purpose include pCVSEII-DHFR and pAdD26SV(A) (described in Ausubel et al., supra).

[0106] A number of other selection systems can be used, including but not limited to, herpes simplex virus thymidine kinase genes, hypoxanthine-guanine phosphoribosyltransferase genes, and adenine phosphoribosyltransferase genes, which can be employed in tk⁻, hgprt⁻, or aprt⁻ cells, respectively. In addition, gpt, which confers resistance to mycophenolic acid (Mulligan et al., Proc. Natl. Acad. Sci. USA, 78:2072, 1981); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al., J. Mol. Biol., 150:1, 1981); and hygro, which confers resistance to hygromycin (Santerre et al., Gene, 30:147, 1984), can be used.

[0107] Alternatively, any fusion protein can be readily purified by utilizing an antibody or other molecule that specifically binds to the fusion protein being expressed. For example, a system described in Janknecht et al., Proc. Natl. Acad. Sci. USA, 88:8972 (1991), allows for the ready purification of non-denatured fusion proteins expressed in human cell lines. In this system, the gene of interest is subcloned into a vaccinia recombination plasmid such that the gene's open reading frame is translationally fused to an amino-terminal tag consisting of six histidine residues. Extracts from cells infected with recombinant vaccinia virus are loaded onto Ni²⁺ nitrilotriacetic acid-agarose columns, and histidine-tagged proteins are selectively eluted with imidazole-containing buffers.

[0108] Alternatively, a GEP polypeptide or homolog or ortholog, or a portion thereof, can be fused to an immunoglobulin Fc domain. Such a fusion protein can be readily purified using a protein A column, for example. Moreover, such fusion proteins permit the production of a chimeric form of a GEP polypeptide or homolog or ortholog having increased stability in vivo.

[0109] Once the recombinant GEP polypeptide (or homolog or ortholog) is expressed, it can be isolated (i.e., purified). Secreted forms of the polypeptides can be isolated from cell culture media, while non-secreted forms must be isolated from the host cells. Polypeptides can be isolated by affinity chromatography. For example, an anti-gep103 antibody (e.g., produced as described herein) can be attached to a column and used to isolate the protein. Lysis and fractionation of cells harboring the protein prior to affinity chromatography can be performed by standard methods (see, e.g., Ausubel et al., supra). Alternatively, a fusion protein can be constructed and used to isolate a GEP polypeptide (e.g., a gep103-maltose binding fusion protein, a gep103-β-galactosidase fusion protein, or a gep103-trpE fusion protein; see, e.g., Ausubel et al., supra; New England Biolabs Catalog, Beverly, Mass.). The recombinant protein can, if desired, be further purified, e.g., by high performance liquid chromatography using standard techniques (see, e.g., Fisher, Laboratory Techniques In Biochemistry And Molecular Biology, eds., Work and Burdon, Elsevier, 1980).

[0110] Given the amino acid sequences described herein, polypeptides useful in practicing the invention, particularly fragments of GEP polypeptides, can be produced by standard chemical synthesis (e.g., by the methods described in Solid Phase Peptide Synthesis, 2nd ed., The Pierce Chemical Co., Rockford, Ill., 1984) and used as antigens, for example.

[0111] Antibodies

[0112] The GEP polypeptides (or antigenic fragments or analogs of such polypeptides) can be used to raise antibodies useful in the invention, and such polypeptides can be produced by recombinant or peptide synthetic techniques (see, e.g., Solid Phase Peptide Synthesis, supra; Ausubel et al., supra). Likewise, antibodies can be raised against the GEP homologs and orthologs. In general, the polypeptides can be coupled to a carrier protein, such as KLH, as described in Ausubel et al., supra, mixed with an adjuvant, and injected into a host mammal. Antibodies can be purified, for example, by affinity chromatography methods in which the polypeptide antigen is immobilized on a resin.

[0113] In particular, various host animals can be immunized by injection of a polypeptide of interest. Examples of suitable host animals include rabbits, mice, guinea pigs, and rats. Various adjuvants can be used to increase the immunological response, depending on the host species, including but not limited to Freund's adjuvant (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, BCG (bacille Calmette-Guerin) and Corynebacterium parvum. Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of the immunized animals.

[0114] Antibodies useful in the invention include monoclonal antibodies, polyclonal antibodies, humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab′)₂ fragments, and molecules produced using a Fab expression library.

[0115] Monoclonal antibodies (mAbs), which are homogeneous populations of antibodies to a particular antigen, can be prepared using the GEP polypeptides or homologs or orthologs thereof and standard hybridoma technology (see, e.g., Kohler et al., Nature, 256:495, 1975; Kohler et al., Eur. J. Immunol., 6:511, 1976; Kohler et al., Eur. J. Immunol., 6:292, 1976; Hammerling et al., In Monoclonal Antibodies and T Cell Hybridomas, Elsevier, N.Y., 1981; Ausubel et al., supra).

[0116] In particular, monoclonal antibodies can be obtained by any technique that provides for the production of antibody molecules by continuous cell lines in culture, such as those described in Kohler et al., Nature, 256:495, 1975, and U.S. Pat. No. 4,376,110; the human B-cell hybridoma technique (Kosbor et al., Immunology Today, 4:72, 1983; Cole et al., Proc. Natl. Acad. Sci. USA, 80:2026, 1983); and the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96, 1983). Such antibodies can be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD, and any subclass thereof. The hybridomas producing the mAbs of this invention can be cultivated in vitro or in vivo.

[0117] Once produced, polyclonal or monoclonal antibodies are tested for specific recognition of a GEP polypeptide or homolog or ortholog thereof in an immunoassay, such as a Western blot or immunoprecipitation analysis using standard techniques, e.g., as described in Ausubel et al., supra. Antibodies that specifically bind to the GEP polypeptides, or conservative variants and homologs or orthologs thereof, are useful in the invention. For example, such antibodies can be used in an immunoassay to detect a GEP polypeptide in pathogenic or non-pathogenic strains of bacteria.

[0118] Preferably, antibodies of the invention are produced using fragments of the GEP polypeptides that appear likely to be antigenic, by criteria such as high frequency of charged residues. In one specific example, such fragments are generated by standard techniques of PCR, and are then cloned into the pGEX expression vector (Ausubel et al., supra). Fusion proteins are expressed in E. coli and purified using a glutathione agarose affinity matrix as described in Ausubel et al., supra.
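
The criterion of a high frequency of charged residues can be sketched as a sliding-window scan over an amino acid sequence. This is a minimal illustration, not the procedure used by the inventors: the residue set (D, E, K, R), the window length, and the function names are assumptions, and the example sequence is invented.

```python
CHARGED = set("DEKR")  # assumption: Asp, Glu, Lys, Arg are counted as charged


def charged_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in the CHARGED set."""
    if not peptide:
        return 0.0
    return sum(aa in CHARGED for aa in peptide.upper()) / len(peptide)


def rank_windows(sequence: str, window: int = 15, top: int = 3):
    """Return the top-scoring windows as (start index, fragment, charged fraction)."""
    seq = sequence.upper()
    scored = [
        (charged_fraction(seq[i:i + window]), i)
        for i in range(len(seq) - window + 1)
    ]
    # Highest charged fraction first; ties are broken by the earlier start position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(i, seq[i:i + window], score) for score, i in scored[:top]]


# Invented sequence with one strongly charged stretch in the middle.
for start, fragment, score in rank_windows("AAAAADEKRDAAAAA", window=5, top=1):
    print(start, fragment, score)
```

Charge density is only one heuristic; candidate fragments ranked this way would still need to be confirmed experimentally as described in the surrounding paragraphs.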

[0119] If desired, several (e.g., two or three) fusions can be generated for each protein, and each fusion can be injected into at least two rabbits. Antisera can be raised by injections in a series, typically including at least three booster injections. Typically, the antisera are checked for their ability to immunoprecipitate a recombinant GEP polypeptide or homolog or ortholog, or unrelated control proteins, such as glucocorticoid receptor, chloramphenicol acetyltransferase, or luciferase.

[0120] Techniques developed for the production of “chimeric antibodies” (Morrison et al., Proc. Natl. Acad. Sci., 81:6851, 1984; Neuberger et al., Nature, 312:604, 1984; Takeda et al., Nature, 314:452, 1984) can be used to splice the genes from a mouse antibody molecule of appropriate antigen specificity together with genes from a human antibody molecule of appropriate biological activity. A chimeric antibody is a molecule in which different portions are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region.

[0121] Alternatively, techniques described for the production of single chain antibodies (U.S. Pat. Nos. 4,946,778 and 4,704,692) can be adapted to produce single chain antibodies against a GEP polypeptide or homolog or ortholog. Single chain antibodies are formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain polypeptide.

[0122] Antibody fragments that recognize and bind to specific epitopes can be generated by known techniques. For example, such fragments can include but are not limited to F(ab′)₂ fragments, which can be produced by pepsin digestion of the antibody molecule, and Fab fragments, which can be generated by reducing the disulfide bridges of F(ab′)₂ fragments. Alternatively, Fab expression libraries can be constructed (Huse et al., Science, 246:1275, 1989) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity.

[0123] Polyclonal and monoclonal antibodies that specifically bind to GEP polypeptides or homologs or orthologs can be used, for example, to detect expression of a GEP gene or homolog or ortholog in another strain of bacteria. For example, a GEP polypeptide can be readily detected in conventional immunoassays of bacterial cells or extracts. Examples of suitable assays include, without limitation, Western blotting, ELISAs, radioimmune assays, and the like.

[0124] Assay for Antibacterial Agents

[0125] The invention provides a method for identifying an antibacterial agent(s). Although the inventors are not bound by any particular theory as to the biological mechanism involved, the new antibacterial agents are thought to inhibit specifically (1) the function of a polypeptide(s) encoded by a nucleic acid located within an operon containing a GEP gene, or (2) expression of a gene located within an operon containing a GEP gene, or homologs or orthologs thereof. Screening for antibacterial agents can be rapidly accomplished by identifying those compounds (e.g., polypeptides or small molecules) that specifically bind to a polypeptide encoded by a nucleic acid located within an operon containing a GEP gene. A homolog or ortholog of a GEP polypeptide can be substituted for the GEP polypeptide in the methods summarized herein. Specific binding of a test compound to a polypeptide can be detected, for example, in vitro by reversibly or irreversibly immobilizing the test compound(s) on a substrate, e.g., the surface of a well of a 96-well polystyrene microtitre plate. Methods for immobilizing polypeptides and other small molecules are well known in the art. For example, the microtitre plates can be coated with a polypeptide encoded by a nucleic acid located within an operon containing a GEP gene (e.g., a GEP polypeptide or a combination of GEP polypeptides and/or homologs and/or orthologs) by adding the polypeptide(s) in a solution (typically, at a concentration of 0.05 to 1 mg/ml in a volume of 1-100 μl) to each well, and incubating the plates at room temperature to 37° C. for 0.1 to 36 hours. Polypeptides that are not bound to the plate can be removed by shaking the excess solution from the plate, and then washing the plate (once or repeatedly) with water or a buffer. Typically, the polypeptide, homolog, or ortholog is contained in water or a buffer. The plate is then washed with a buffer that lacks the bound polypeptide. To block the free protein-binding sites on the plates, the plates are blocked with a protein that is unrelated to the bound polypeptide. For example, 300 μl of bovine serum albumin (BSA) at a concentration of 2 mg/ml in Tris-HCl is suitable. Suitable substrates include those substrates that contain a defined cross-linking chemistry (e.g., plastic substrates, such as polystyrene, styrene, or polypropylene substrates from Corning Costar Corp. (Cambridge, Mass.), for example). If desired, a beaded particle, e.g., beaded agarose or beaded sepharose, can be used as the substrate.
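
The coating and blocking quantities given above translate into a mass of protein per well by simple arithmetic, since 1 mg/ml is the same as 1 μg/μl. The sketch below works through that conversion; the function name mass_per_well_ug is hypothetical and is used only for illustration.

```python
def mass_per_well_ug(conc_mg_per_ml: float, volume_ul: float) -> float:
    """Mass of protein added to one well, in micrograms.

    Because 1 mg/ml equals 1 ug/ul, the mass in ug is simply the
    concentration (mg/ml) multiplied by the volume (ul).
    """
    if conc_mg_per_ml < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_mg_per_ml * volume_ul


# Ends of the coating range stated in the text: 0.05 to 1 mg/ml in 1 to 100 ul.
lowest = mass_per_well_ug(0.05, 1)
highest = mass_per_well_ug(1, 100)

# Blocking step from the text: 300 ul of BSA at 2 mg/ml.
bsa = mass_per_well_ug(2, 300)

print(lowest, highest, bsa)
```

On these figures the blocking step supplies 600 μg of BSA per well, several times the largest coating mass in the stated range (100 μg).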

[0126] Binding of the test compound to the new polypeptides (or homologs or orthologs thereof) can be detected by any of a variety of art-known methods. For example, an antibody that specifically binds to a GEP polypeptide can be used in an immunoassay. If desired, the antibody can be labeled (e.g., fluorescently or with a radioisotope) and detected directly (see, e.g., West and McMahon, J. Cell Biol. 74:264, 1977). Alternatively, a second antibody can be used for detection (e.g., a labeled antibody that binds to the Fc portion of an anti-GEP103 antibody). In an alternative detection method, the GEP polypeptide is labeled, and the label is detected (e.g., by labeling a GEP polypeptide with a radioisotope, fluorophore, chromophore, or the like). In still another method, the GEP polypeptide is produced as a fusion protein with a protein that can be detected optically, e.g., green fluorescent protein (which can be detected under UV light). In an alternative method, the polypeptide (e.g., gep103) can be produced as a fusion protein with an enzyme having a detectable enzymatic activity, such as horseradish peroxidase, alkaline phosphatase, β-galactosidase, or glucose oxidase. Genes encoding all of these enzymes have been cloned and are readily available for use by those of skill in the art. If desired, the fusion protein can include an antigen, and such an antigen can be detected and measured with a polyclonal or monoclonal antibody using conventional methods. Suitable antigens include enzymes (e.g., horseradish peroxidase, alkaline phosphatase, and β-galactosidase) and non-enzymatic polypeptides (e.g., serum proteins, such as BSA and globulins, and milk proteins, such as caseins).

[0127] In various in vivo methods for identifying polypeptides that bind to GEP polypeptides, the conventional two-hybrid assays of protein/protein interactions can be used (see, e.g., Chien et al., Proc. Natl. Acad. Sci. USA, 88:9578, 1991; Fields et al., U.S. Pat. No. 5,283,173; Fields and Song, Nature, 340:245, 1989; Le Douarin et al., Nucleic Acids Research, 23:876, 1995; Vidal et al., Proc. Natl. Acad. Sci. USA, 93:10315-10320, 1996; and White, Proc. Natl. Acad. Sci. USA, 93:10001-10003, 1996). Kits for practicing various two-hybrid methods are commercially available (e.g., from Clontech; Palo Alto, Calif.).

[0128] Generally, the two-hybrid methods involve in vivo reconstitution of two separable domains of a transcription factor. The DNA binding domain (DB) of the transcription factor is required for recognition of a chosen promoter. The activation domain (AD) is required for contacting other components of the host cell's transcriptional machinery. The transcription factor is reconstituted through the use of hybrid proteins. One hybrid is composed of the AD and a first protein of interest. The second hybrid is composed of the DB and a second protein of interest.

[0129] Useful reporter genes are those that are operably linked to a promoter which is specifically recognized by the DB. Typically, the two-hybrid system employs the yeast Saccharomyces cerevisiae and reporter genes, the expression of which can be selected under appropriate conditions. Other eukaryotic cells, including mammalian and insect cells, can be used, if desired. The two-hybrid system provides a convenient method for cloning a gene encoding a polypeptide (i.e., a candidate antibacterial agent) that binds to a second, preselected polypeptide (e.g., gep103). Typically, though not necessarily, a DNA library is constructed such that randomly generated sequences are fused to the AD, and the protein of interest (e.g., gep103) is fused to the DB.

[0130] In such two-hybrid methods, two fusion proteins are produced. One fusion protein contains the GEP polypeptide (or homolog or ortholog thereof) fused to either a transactivator domain or DNA binding domain of a transcription factor (e.g., of Gal4). The other fusion protein contains a test polypeptide fused to either the DNA binding domain or a transactivator domain of a transcription factor. Once brought together in a single cell (e.g., a yeast cell or mammalian cell), one of the fusion proteins contains the transactivator domain and the other fusion protein contains the DNA binding domain. Therefore, binding of the GEP polypeptide to the test polypeptide (i.e., candidate antibacterial agent) reconstitutes the transcription factor. Reconstitution of the transcription factor can be detected by detecting expression of a gene (i.e., a reporter gene) that is operably linked to a DNA sequence that is bound by the DNA binding domain of the transcription factor.
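
The logic of the two-hybrid readout described above can be summarized in a small model: the reporter is expressed only when one hybrid carries the AD, the other carries the DB, and the two fused proteins bind each other. The class, function, and protein names below (including the test polypeptide names and the set of interacting pairs) are hypothetical and serve only to illustrate that logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Hybrid:
    domain: str    # "AD" (activation domain) or "DB" (DNA binding domain)
    protein: str   # name of the fused protein of interest


def reporter_expressed(
    first: Hybrid, second: Hybrid, interacts: Callable[[str, str], bool]
) -> bool:
    """True only if the hybrids supply one AD and one DB and their proteins bind."""
    complementary = {first.domain, second.domain} == {"AD", "DB"}
    return complementary and interacts(first.protein, second.protein)


# Hypothetical interaction data used only for this example.
KNOWN_PAIRS = {frozenset({"gep103", "test_polypeptide_1"})}


def binds(a: str, b: str) -> bool:
    return frozenset({a, b}) in KNOWN_PAIRS


bait = Hybrid("DB", "gep103")
print(reporter_expressed(bait, Hybrid("AD", "test_polypeptide_1"), binds))  # binding pair
print(reporter_expressed(bait, Hybrid("AD", "test_polypeptide_2"), binds))  # no binding
print(reporter_expressed(bait, Hybrid("DB", "test_polypeptide_1"), binds))  # no AD present
```

The model treats binding as all-or-nothing; in a real assay, reporter expression depends on the strength of the interaction and on the assay conditions.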

[0131] The methods described above can be used for high throughput screening of numerous test compounds to identify candidate antibacterial (or anti-bacterial) agents. Having identified a test compound as a candidate antibacterial agent, the candidate antibacterial agent can be further tested for inhibition of bacterial growth in vitro or in vivo (e.g., using an animal, e.g., rodent, model system) if desired. Using other, art-known variations of such methods, one can test the ability of a nucleic acid (e.g., DNA or RNA) used as the test compound to bind to a polypeptide encoded by a nucleic acid sequence located within an operon containing a GEP gene or homolog or ortholog thereof.

[0132] In vitro, further testing can be accomplished by means known to those in the art such as an enzyme inhibition assay or a whole-cell bacterial growth inhibition assay. For example, an agar dilution assay identifies a substance that inhibits bacterial growth. Microtiter plates are prepared with serial dilutions of the test compound, a given amount of growth substrate is added to the preparation, and a preparation of Streptococcus cells is provided. Inhibition of growth is determined, for example, by observing changes in optical densities of the bacterial cultures.
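
A serial dilution series of the kind used to prepare such plates can be computed as follows. The two-fold dilution factor, the starting concentration, the number of wells, and the function name serial_dilution are assumptions chosen for illustration; the text does not specify them.

```python
def serial_dilution(start_conc: float, factor: float = 2.0, wells: int = 8) -> list:
    """Concentrations across a dilution series, starting at start_conc.

    Each successive well holds the previous concentration divided by factor.
    """
    if start_conc <= 0 or factor <= 1 or wells < 1:
        raise ValueError("need start_conc > 0, factor > 1, and wells >= 1")
    return [start_conc / factor ** i for i in range(wells)]


# Hypothetical two-fold series over four wells, starting at 64 (units as chosen by the user).
print(serial_dilution(64, factor=2.0, wells=4))  # [64.0, 32.0, 16.0, 8.0]
```

The lowest concentration in the series that prevents growth would then be read from the optical density measurements mentioned above.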

[0133] Inhibition of bacterial growth is demonstrated, for example, by comparing (in the presence and absence of a test compound) the rate of growth or the absolute growth of bacterial cells. Inhibition includes a reduction of one of the above measurements by at least 20% (e.g., at least 25%, 30%, 40%, 50%, 75%, 80%, or 90%).
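
The comparison described above reduces to a percent-reduction calculation against the untreated control. The sketch below assumes that growth is expressed as a single positive number (for example, an optical density reading or a growth rate); the function names and the small floating-point tolerance are illustrative assumptions.

```python
def percent_inhibition(control_growth: float, treated_growth: float) -> float:
    """Percent reduction in growth relative to the untreated control."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (control_growth - treated_growth) / control_growth


def is_inhibitory(control_growth: float, treated_growth: float,
                  threshold_pct: float = 20.0) -> bool:
    """True if growth is reduced by at least threshold_pct (20% by default)."""
    # A tiny tolerance keeps exact-threshold cases from failing on rounding error.
    return percent_inhibition(control_growth, treated_growth) >= threshold_pct - 1e-9


# Hypothetical optical density readings: control 1.0, treated 0.75.
print(percent_inhibition(1.0, 0.75))   # 25.0
print(is_inhibitory(1.0, 0.75))        # True
print(is_inhibitory(1.0, 0.9))         # False
```

Raising threshold_pct to one of the higher values listed in the text (e.g., 50 or 90) applies a stricter criterion with the same calculation.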

[0134] Rodent (e.g., murine) and rabbit animal models of streptococcal infections are known to those of skill in the art, and such animal model systems are accepted for screening antibacterial agents as an indication of their therapeutic efficacy in human patients. In a typical in vivo assay, an animal is infected with a pathogenic Streptococcus strain, e.g., by inhalation of Streptococcus pneumoniae, and conventional methods and criteria are used to diagnose the mammal as being afflicted with streptococcal pneumonia. The candidate antibacterial agent then is administered to the mammal at a dosage of 1-100 mg/kg of body weight, and the mammal is monitored for signs of amelioration of disease. Alternatively, the test compound can be administered to the mammal prior to infecting the mammal with Streptococcus, and the ability of the treated mammal to resist infection is measured. Of course, the results obtained in the presence of the test compound should be compared with results in control animals, which are not treated with the test compound. Administration of candidate antibacterial agent to the mammal can be carried out as described below, for example.

[0135] Pharmaceutical Formulations

[0136] Treatment includes administering a pharmaceutically effective amount of a composition containing an antibacterial agent to a subject in need of such treatment, thereby inhibiting bacterial growth in the subject. Such a composition typically contains from about 0.1 to 90% by weight (such as 1 to 20% or 1 to 10%) of an antibacterial agent of the invention in a pharmaceutically acceptable carrier.

[0137] Solid formulations of the compositions for oral administration may contain suitable carriers or excipients, such as corn starch, gelatin, lactose, acacia, sucrose, microcrystalline cellulose, kaolin, mannitol, dicalcium phosphate, calcium carbonate, sodium chloride, or alginic acid. Disintegrators that can be used include, without limitation, microcrystalline cellulose, corn starch, sodium starch glycolate and alginic acid. Tablet binders that may be used include acacia, methylcellulose, sodium carboxymethylcellulose, polyvinylpyrrolidone (Povidone), hydroxypropyl methylcellulose, sucrose, starch, and ethylcellulose. Lubricants that may be used include magnesium stearates, stearic acid, silicone fluid, talc, waxes, oils, and colloidal silica.

[0138] Liquid formulations of the compositions for oral administration prepared in water or other aqueous vehicles may contain various suspending agents such as methylcellulose, alginates, tragacanth, pectin, kelgin, carrageenan, acacia, polyvinylpyrrolidone, and polyvinyl alcohol. The liquid formulations may also include solutions, emulsions, syrups and elixirs containing, together with the active compound(s), wetting agents, sweeteners, and coloring and flavoring agents. Various liquid and powder formulations can be prepared by conventional methods for inhalation into the lungs of the mammal to be treated.

[0139] Injectable formulations of the compositions may contain various carriers such as vegetable oils, dimethylacetamide, dimethylformamide, ethyl lactate, ethyl carbonate, isopropyl myristate, ethanol, and polyols (glycerol, propylene glycol, liquid polyethylene glycol, and the like). For intravenous injections, water soluble versions of the compounds may be administered by the drip method, whereby a pharmaceutical formulation containing the antibacterial agent and a physiologically acceptable excipient is infused. Physiologically acceptable excipients may include, for example, 5% dextrose, 0.9% saline, Ringer's solution or other suitable excipients. For intramuscular preparations, a sterile formulation of a suitable soluble salt form of the compounds can be dissolved and administered in a pharmaceutical excipient such as Water-for-Injection, 0.9% saline, or 5% glucose solution. A suitable insoluble form of the compound may be prepared and administered as a suspension in an aqueous base or a pharmaceutically acceptable oil base, such as an ester of a long chain fatty acid (e.g., ethyl oleate).

[0140] A topical semi-solid ointment formulation typically contains a concentration of the active ingredient from about 1 to 20%, e.g., 5 to 10%, in a carrier such as a pharmaceutical cream base. Various formulations for topical use include drops, tinctures, lotions, creams, solutions, and ointments containing the active ingredient and various supports and vehicles.

[0141] The optimal percentage of the antibacterial agent in each pharmaceutical formulation varies according to the formulation itself and the therapeutic effect desired in the specific pathologies and correlated therapeutic regimens. Appropriate dosages of the antibacterial agents can readily be determined by those of ordinary skill in the art of medicine by monitoring the mammal for signs of disease amelioration or inhibition, and increasing or decreasing the dosage and/or frequency of treatment as desired. The optimal amount of the antibacterial compound used for treatment of conditions caused by or contributed to by bacterial infection may depend upon the manner of administration, the age and the body weight of the subject and the condition of the subject to be treated. Generally, the antibacterial compound is administered at a dosage of 1 to 100 mg/kg of body weight, and typically at a dosage of 1 to 10 mg/kg of body weight.
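
The weight-based dosage range stated above can be turned into an absolute amount by multiplying by body weight. The following sketch shows only that arithmetic; the function name dose_mg and the example body weights are hypothetical, and the range check simply mirrors the 1 to 100 mg/kg figure given in the text.

```python
def dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Absolute dose in mg for a given body weight and weight-based dosage."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 1 <= dose_mg_per_kg <= 100:
        raise ValueError("dosage outside the 1 to 100 mg/kg range stated in the text")
    return body_weight_kg * dose_mg_per_kg


# Hypothetical subjects: a 70 kg subject at 10 mg/kg and a 25 g mouse at 100 mg/kg.
print(dose_mg(70, 10))       # 700
print(dose_mg(0.025, 100))
```

The calculation does not replace the titration described above, in which the dosage is adjusted according to the observed response.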

EXAMPLE

[0142] Using the transposon-based mutagenesis methods described above, the Streptococcus pneumoniae genome was mutagenized, and 23 genes were identified as being located within operons that are essential for survival of Streptococcus pneumoniae. These genes are listed in Table 1, above, and their nucleic acid and amino acid sequences are represented by SEQ ID NOs: 1-69, as shown in FIGS. 1-23.

[0143] Now that each of these genes is known to be located within an operon that is essential for survival of Streptococcus, the polypeptides encoded by nucleic acids located within those operons can be used to identify antibacterial agents by using the assays described herein. Other art-known assays to detect interactions of test compounds with proteins, or to detect inhibition of bacterial growth, also can be used with the nucleic acids located within operons containing the GEP genes, and gene products and homologs or orthologs thereof.

Other Embodiments

[0144] The invention also features fragments, variants, analogs, and derivatives of the GEP polypeptides described above that retain one or more of the biological activities of the GEP polypeptides, e.g., as determined in a complementation assay. Also included within the invention are naturally-occurring and non-naturally-occurring allelic variants. Compared with the naturally-occurring GEP gene sequences depicted in FIGS. 1-23, the nucleic acid sequence encoding allelic variants may have a substitution, deletion, or addition of one or more nucleotides. The preferred allelic variants are functionally equivalent to a GEP polypeptide, e.g., as determined in a complementation assay.

[0145] It is to be understood that, while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:
1. An isolated operon comprising a nucleotide sequence, or an allelic variant or homolog of the nucleotide sequence, encoding: a gep103 polypeptide comprising the amino acid sequence of SEQ ID NO: 1, as depicted in FIG. 1; a gep1119 polypeptide comprising the amino acid sequence of SEQ ID NO: 4, as depicted in FIG. 2; a gep1122 polypeptide comprising the amino acid sequence of SEQ ID NO: 7, as depicted in FIG. 3; a gep1315 polypeptide comprising the amino acid sequence of SEQ ID NO: 10, as depicted in FIG. 4; a gep1493 polypeptide comprising the amino acid sequence of SEQ ID NO: 13, as depicted in FIG. 5; a gep1507 polypeptide comprising the amino acid sequence of SEQ ID NO: 16, as depicted in FIG. 6; a gep1511 polypeptide comprising the amino acid sequence of SEQ ID NO: 19, as depicted in FIG. 7; a gep1518 polypeptide comprising the amino acid sequence of SEQ ID NO: 22, as depicted in FIG. 8; a gep1546 polypeptide comprising the amino acid sequence of SEQ ID NO: 25, as depicted in FIG. 9; a gep1551 polypeptide comprising the amino acid sequence of SEQ ID NO: 28, as depicted in FIG. 10; a gep1561 polypeptide comprising the amino acid sequence of SEQ ID NO: 31, as depicted in FIG. 11; a gep1580 polypeptide comprising the amino acid sequence of SEQ ID NO: 34, as depicted in FIG. 12; a gep1713 polypeptide comprising the amino acid sequence of SEQ ID NO: 37, as depicted in FIG. 13; a gep222 polypeptide comprising the amino acid sequence of SEQ ID NO: 40, as depicted in FIG. 14; a gep2283 polypeptide comprising the amino acid sequence of SEQ ID NO: 43, as depicted in FIG. 15; a gep273 polypeptide comprising the amino acid sequence of SEQ ID NO: 46, as depicted in FIG. 16; a gep286 polypeptide comprising the amino acid sequence of SEQ ID NO: 49, as depicted in FIG. 17; a gep311 polypeptide comprising the amino acid sequence of SEQ ID NO: 52, as depicted in FIG. 18; a gep3262 polypeptide comprising the amino acid sequence of SEQ ID NO: 55, as depicted in FIG. 19; a gep3387 polypeptide comprising the amino acid sequence of SEQ ID NO: 58, as depicted in FIG. 20; a gep47 polypeptide comprising the amino acid sequence of SEQ ID NO: 61, as depicted in FIG. 21; a gep61 polypeptide comprising the amino acid sequence of SEQ ID NO: 64, as depicted in FIG. 22; or a gep76 polypeptide comprising the amino acid sequence of SEQ ID NO: 67, as depicted in FIG. 23.
2. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of: (1) an operon comprising the sequence of SEQ ID NO: 2, as depicted in FIG. 1, or degenerate variants thereof; (2) an operon comprising the sequence of SEQ ID NO: 2, or degenerate variants thereof, wherein T is replaced by U; (3) nucleic acids complementary to (1) and (2); (4) fragments of (1), (2), and (3) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 1; (5) an operon comprising the sequence of SEQ ID NO: 5, as depicted in FIG. 2, or degenerate variants thereof; (6) an operon comprising the sequence of SEQ ID NO: 5, or degenerate variants thereof, wherein T is replaced by U; (7) nucleic acids complementary to (5) and (6); (8) fragments of (5), (6), and (7) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 4; (9) an operon comprising the sequence of SEQ ID NO: 8, as depicted in FIG. 3, or degenerate variants thereof; (10) an operon comprising the sequence of SEQ ID NO: 8, or degenerate variants thereof, wherein T is replaced by U; (11) nucleic acids complementary to (9) and (10); (12) fragments of (9), (10), and (11) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 7; (13) an operon comprising the sequence of SEQ ID NO: 11, as depicted in FIG. 4, or degenerate variants thereof; (14) an operon comprising the sequence of SEQ ID NO: 11, or degenerate variants thereof, wherein T is replaced by U; (15) nucleic acids complementary to (13) and (14); (16) fragments of (13), (14), and (15) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 10; (17) an operon comprising the sequence of SEQ ID NO: 14, as depicted in FIG. 5, or degenerate variants thereof; (18) an operon comprising the sequence of SEQ ID NO: 14, or degenerate variants thereof, wherein T is replaced by U; (19) nucleic acids complementary to (17) and (18); (20) fragments of (17), (18), and (19) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 13; (21) an operon comprising the sequence of SEQ ID NO: 17, as depicted in FIG. 6, or degenerate variants thereof; (22) an operon comprising the sequence of SEQ ID NO: 17, or degenerate variants thereof, wherein T is replaced by U; (23) nucleic acids complementary to (21) and (22); (24) fragments of (21), (22), and (23) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 16; (25) an operon comprising the sequence of SEQ ID NO: 20, as depicted in FIG. 7, or degenerate variants thereof; (26) an operon comprising the sequence of SEQ ID NO: 20, or degenerate variants thereof, wherein T is replaced by U; (27) nucleic acids complementary to (25) and (26); (28) fragments of (25), (26), and (27) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 19; (29) an operon comprising the sequence of SEQ ID NO: 23, as depicted in FIG. 8, or degenerate variants thereof; (30) an operon comprising the sequence of SEQ ID NO: 23, or degenerate variants thereof, wherein T is replaced by U; (31) nucleic acids complementary to (29) and (30); (32) fragments of (29), (30), and (31) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 22; (33) an operon comprising the sequence of SEQ ID NO: 26, as depicted in FIG. 9, or degenerate variants thereof; (34) an operon comprising the sequence of SEQ ID NO: 26, or degenerate variants thereof, wherein T is replaced by U; (35) nucleic acids complementary to (33) and (34); (36) fragments of (33), (34), and (35) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 25; (37) an operon comprising the sequence of SEQ ID NO: 29, as depicted in FIG. 10, or degenerate variants thereof; (38) an operon comprising the sequence of SEQ ID NO: 29, or degenerate variants thereof, wherein T is replaced by U; (39) nucleic acids complementary to (37) and (38); (40) fragments of (37), (38), and (39) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 28; (41) an operon comprising the sequence of SEQ ID NO: 32, as depicted in FIG. 11, or degenerate variants thereof; (42) an operon comprising the sequence of SEQ ID NO: 32, or degenerate variants thereof, wherein T is replaced by U; (43) nucleic acids complementary to (41) and (42); (44) fragments of (41), (42), and (43) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 31; (45) an operon comprising the sequence of SEQ ID NO: 35, as depicted in FIG. 
12, ordegenerate variants thereof; (46) an operon comprising the sequence ofSEQ ID NO: 35, or degenerate variants thereof, wherein T is replaced byU; (47) nucleic acids complementary to (45) and (46); and (48) fragmentsof (45), (46), and (47) that are at least 15 base pairs in length andwhich hybridize under stringent conditions to genomic DNA encoding thepolypeptide of SEQ ID NO: 34; (49) an operon comprising the sequence ofSEQ ID NO: 38, as depicted in FIG. 13, or degenerate variants thereof;(50) an operon comprising the sequence of SEQ ID NO: 38, or degeneratevariants thereof, wherein T is replaced by U; (51) nucleic acidscomplementary to (49) and (50); (52) fragments of (49), (50), and (51)that are at least 15 base pairs in length and which hybridize understringent conditions to genomic DNA encoding the polypeptide of SEQ IDNO: 37; (53) an operon comprising the sequence of SEQ ID NO: 41, asdepicted in FIG. 14, or degenerate variants thereof; (54) an operoncomprising the sequence of SEQ ID NO: 41, or degenerate variantsthereof, wherein T is replaced by U; (55) nucleic acids complementary to(53) and (54); (56) fragments of (53), (54), and (55) that are at least15 base pairs in length and which hybridize under stringent conditionsto genomic DNA encoding the polypeptide of SEQ ID NO: 40; (57) an operoncomprising the sequence of SEQ ID NO: 44, as depicted in FIG. 15, ordegenerate variants thereof; (58) an operon comprising the sequence ofSEQ ID NO: 44, or degenerate variants thereof, wherein T is replaced byU; (59) nucleic acids complementary to (57) and (58); (60) fragments of(57), (58), and (59) that are at least 15 base pairs in length and whichhybridize under stringent conditions to genomic DNA encoding thepolypeptide of SEQ ID NO: 39; (61) an operon comprising the sequence ofSEQ ID NO: 47, as depicted in FIG. 
16, or degenerate variants thereof;(62) an operon comprising the sequence of SEQ ID NO: 47, or degeneratevariants thereof, wherein T is replaced by U; (63) nucleic acidscomplementary to (61) and (62); and (64) fragments of (61), (62), and(63) that are at least 15 base pairs in length and which hybridize understringent conditions to genomic DNA encoding the polypeptide of SEQ IDNO: 46; (65) an operon comprising the sequence of SEQ ID NO: 50, asdepicted in FIG. 17, or degenerate variants thereof; (66) an operoncomprising the sequence of SEQ ID NO: 50, or degenerate variantsthereof, wherein T is replaced by U; (67) nucleic acids complementary to(65) and (66); (68) fragments of (65), (66), and (67) that are at least15 base pairs in length and which hybridize under stringent conditionsto genomic DNA encoding the polypeptide of SEQ ID NO: 49; (69) an operoncomprising the sequence of SEQ ID NO: 53, as depicted in FIG. 18, ordegenerate variants thereof; (70) an operon comprising the sequence ofSEQ ID NO: 53, or degenerate variants thereof, wherein T is replaced byU; (71) nucleic acids complementary to (69) and (70); (72) fragments of(69), (70), and (71) that are at least 15 base pairs in length and whichhybridize under stringent conditions to genomic DNA encoding thepolypeptide of SEQ ID NO: 52; (73) an operon comprising the sequence ofSEQ ID NO: 56, as depicted in FIG. 19, or degenerate variants thereof;(74) an operon comprising the sequence of SEQ ID NO: 56, or degeneratevariants thereof, wherein T is replaced by U; (75) nucleic acidscomplementary to (73) and (74); (76) fragments of (73), (74), and (75)that are at least 15 base pairs in length and which hybridize understringent conditions to genomic DNA encoding the polypeptide of SEQ IDNO: 55; (77) an operon comprising the sequence of SEQ ID NO: 59, asdepicted in FIG. 
20, or degenerate variants thereof; (78) an operoncomprising the sequence of SEQ ID NO: 59, or degenerate variantsthereof, wherein T is replaced by U; (79) nucleic acids complementary to(77) and (78); and (80) fragments of (77), (78), and (79) that are atleast 15 base pairs in length and which hybridize under stringentconditions to genomic DNA encoding the polypeptide of SEQ ID NO: 58;(81) an operon comprising the sequence of SEQ ID NO: 62, as depicted inFIG. 21, or degenerate variants thereof; (82) an operon comprising thesequence of SEQ ID NO: 62, or degenerate variants thereof, wherein T isreplaced by U; (83) nucleic acids complementary to (81) and (82); (84)fragments of (81), (82), and (83) that are at least 15 base pairs inlength and which hybridize under stringent conditions to genomic DNAencoding the polypeptide of SEQ ID NO: 61; (85) an operon comprising thesequence of SEQ ID NO: 65; as depicted in FIG. 22, or degeneratevariants thereof; (86) an operon comprising the sequence of SEQ ID NO:65, or degenerate variants thereof, wherein T is replaced by U; (87)nucleic acids complementary to (85) and (86); (88) fragments of (85),(86), and (87) that are at least 15 base pairs in length and whichhybridize under stringent conditions to genomic DNA encoding thepolypeptide of SEQ ID NO: 66; (89) an operon comprising the sequence ofSEQ ID NO: 68, as depicted in FIG. 23, or degenerate variants thereof;(90) an operon comprising the sequence of SEQ ID NO: 68, or degeneratevariants thereof, wherein T is replaced by U; (91) nucleic acidscomplementary to (89) and (90); and (92) fragments of (89), (90), and(91) that are at least 15 base pairs in length and which hybridize understringent conditions to genomic DNA encoding the polypeptide of SEQ IDNO:
 67. 3. An isolated operon from Streptococcus comprising a nucleotidesequence that is at least 85% identical to a nucleotide sequenceselected from the group consisting of SEQ ID NO: 2; SEQ ID NO: 5; SEQ IDNO: 8; SEQ ID NO: 11; SEQ ID NO: 14; SEQ ID NO: 17; SEQ ID NO: 20; SEQID NO: 23; SEQ ID NO: 26; SEQ ID NO: 29; SEQ ID NO: 32; SEQ ID NO: 35;SEQ ID NO: 38; SEQ ID NO: 41; SEQ ID NO: 44; SEQ ID NO: 47; SEQ ID NO:50; SEQ ID NO: 53; SEQ ID NO: 56; SEQ ID NO: 59; SEQ ID NO: 62; SEQ IDNO: 65; and SEQ ID NO:
 68. 4. An isolated nucleic acid molecule that isat least 15 base pairs in length and hybridizes under stringentconditions to a nucleotide sequence selected from the group consistingof SEQ ID NO: 2; SEQ ID NO: 5; SEQ ID NO: 8; SEQ ID NO: 11; SEQ ID NO:14; SEQ ID NO: 17; SEQ ID NO: 20; SEQ ID NO: 23; SEQ ID NO: 26; SEQ IDNO: 29; SEQ ID NO: 32; SEQ ID NO: 35; SEQ ID NO: 38; SEQ ID NO: 41; SEQID NO: 44; SEQ ID NO: 47; SEQ ID NO: 50; SEQ ID NO: 53; SEQ ID NO: 56;SEQ ID NO: 59; SEQ ID NO: 62; SEQ ID NO: 65; and SEQ ID NO:
 68. 5. Avector comprising an operon of claim
 1. 6. A vector comprising a nucleicacid molecule of claim
 2. 7. An expression vector comprising an operonof claim 1 operably linked to a nucleotide sequence regulatory elementthat controls expression of said operon.
8. An expression vector comprising a nucleic acid molecule of claim 2, wherein said nucleic acid molecule is operably linked to a nucleotide sequence regulatory element that controls expression of said nucleic acid.
9. A host cell comprising an exogenously introduced operon of claim 1.
10. A host cell comprising an exogenously introduced nucleic acid molecule of claim 2.
11. A host cell of claim 9, wherein the cell is a yeast or bacterium.
12. A host cell of claim 10, wherein the cell is a yeast or bacterium.
13. A genetically engineered host cell comprising an operon of claim 1 operably linked to a heterologous nucleotide sequence regulatory element that controls expression of the operon in the host cell.
14. A host cell of claim 13, wherein the cell is a yeast or bacterium.
15. A genetically engineered host cell comprising a nucleic acid molecule of claim 2 operably linked to a nucleotide sequence regulatory element that controls expression of the nucleic acid in the host cell.
16. A host cell of claim 15, wherein the cell is a yeast or bacterium.
17. An isolated operon comprising a nucleotide sequence encoding a polypeptide comprising an amino acid sequence selected from the group consisting of: the amino acid sequence of SEQ ID NO: 1, as depicted in FIG. 1; the amino acid sequence of SEQ ID NO: 4, as depicted in FIG. 2; the amino acid sequence of SEQ ID NO: 7, as depicted in FIG. 3; the amino acid sequence of SEQ ID NO: 10, as depicted in FIG. 4; the amino acid sequence of SEQ ID NO: 13, as depicted in FIG. 5; the amino acid sequence of SEQ ID NO: 16, as depicted in FIG. 6; the amino acid sequence of SEQ ID NO: 19, as depicted in FIG. 7; the amino acid sequence of SEQ ID NO: 22, as depicted in FIG. 8; the amino acid sequence of SEQ ID NO: 25, as depicted in FIG. 9; the amino acid sequence of SEQ ID NO: 28, as depicted in FIG. 10; the amino acid sequence of SEQ ID NO: 31, as depicted in FIG. 11; the amino acid sequence of SEQ ID NO: 34, as depicted in FIG. 12; the amino acid sequence of SEQ ID NO: 37, as depicted in FIG. 13; the amino acid sequence of SEQ ID NO: 40, as depicted in FIG. 14; the amino acid sequence of SEQ ID NO: 43, as depicted in FIG. 15; the amino acid sequence of SEQ ID NO: 46, as depicted in FIG. 16; the amino acid sequence of SEQ ID NO: 49, as depicted in FIG. 17; the amino acid sequence of SEQ ID NO: 52, as depicted in FIG. 18; the amino acid sequence of SEQ ID NO: 55, as depicted in FIG. 19; the amino acid sequence of SEQ ID NO: 58, as depicted in FIG. 20; the amino acid sequence of SEQ ID NO: 61, as depicted in FIG. 21; the amino acid sequence of SEQ ID NO: 64, as depicted in FIG. 22; and the amino acid sequence of SEQ ID NO: 67, as depicted in FIG. 23.
18. An isolated polypeptide encoded by a nucleic acid located within an operon comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50, 53, 56, 59, 62, 65, and 68.
19. An isolated polypeptide, said polypeptide being encoded by an operon of claim 1.
20. An isolated polypeptide, said polypeptide being encoded by a nucleic acid molecule of claim 2.
21. An isolated polypeptide, said polypeptide being encoded by an operon of claim 3.
22. A method for identifying an antibacterial agent, the method comprising: (a) contacting a test compound with a polypeptide, or a homolog of a polypeptide, encoded by a nucleic acid sequence located within an operon comprising a GEP gene selected from the group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76; and (b) detecting binding of the test compound to the polypeptide, wherein binding indicates that the test compound is an antibacterial agent.
23. The method of claim 22, further comprising: (c) determining whether a test compound that binds to the polypeptide inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of a test compound that binds to the polypeptide, wherein inhibition of growth indicates that the test compound is an antibacterial agent.
24. The method of claim 22, wherein the polypeptide is selected from the group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76.
25. The method of claim 22, wherein the test compound is immobilized on a substrate, and binding of the test compound to the polypeptide is detected as immobilization of the polypeptide on the immobilized test compound.
26. The method of claim 25, wherein immobilization of the polypeptide on the test compound is detected in an immunoassay with an antibody that specifically binds to the polypeptide.
27. The method of claim 22, wherein the test compound is selected from the group consisting of polypeptides and small molecules.
28. The method of claim 22, wherein: (a) the polypeptide is provided as a fusion protein comprising the polypeptide fused to (i) a transcription activation domain of a transcription factor or (ii) a DNA-binding domain of a transcription factor; (b) the test compound is a polypeptide that is provided as a fusion protein comprising the test polypeptide fused to (i) a transcription activation domain of a transcription factor or (ii) a DNA-binding domain of a transcription factor, to interact with the fusion protein; and (c) binding of the test compound to the polypeptide is detected as reconstitution of a transcription factor.
29. An antibody that specifically binds to a GEP polypeptide of claim 19.
30. An antibody of claim 29, wherein the antibody is a monoclonal antibody.
31. A method for identifying an antibacterial agent, the method comprising: (a) contacting a polypeptide encoded by a nucleic acid located within an operon comprising a GEP gene with a test compound; (b) detecting a decrease in function of the polypeptide contacted with the test compound; and (c) determining whether a test compound that decreases function of a contacted polypeptide inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of a test compound that decreases function of a contacted polypeptide, wherein inhibition of growth indicates that the test compound is an antibacterial agent.
32. The method of claim 31, wherein the polypeptide is selected from the group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76.
33. The method of claim 31, wherein the test compound is selected from the group consisting of polypeptides and small molecules.
34. A method for identifying an antibacterial agent, the method comprising: (a) contacting a nucleic acid comprising an operon containing a gene encoding a GEP polypeptide with a test compound, wherein the GEP polypeptide is selected from the group consisting of gep103, gep1119, gep1122, gep1315, gep1493, gep1507, gep1511, gep1518, gep1546, gep1551, gep1561, gep1580, gep1713, gep222, gep2283, gep273, gep286, gep311, gep3262, gep3387, gep47, gep61, and gep76; and (b) detecting binding of the test compound to the nucleic acid, wherein binding indicates that the test compound is an antibacterial agent.
35. The method of claim 34, further comprising: (c) determining whether a test compound that binds to the nucleic acid inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of the test compound that binds to the nucleic acid, wherein inhibition of growth indicates that the test compound is an antibacterial agent.
36. The method of claim 34, wherein the test compound is selected from the group consisting of polypeptides and small molecules.
37. An isolated nucleic acid or an allelic variant thereof encoding: a gep1493 polypeptide comprising the amino acid sequence of SEQ ID NO: 13, as depicted in FIG. 5; a gep1507 polypeptide comprising the amino acid sequence of SEQ ID NO: 16, as depicted in FIG. 6; a gep1546 polypeptide comprising the amino acid sequence of SEQ ID NO: 25, as depicted in FIG. 9; a gep273 polypeptide comprising the amino acid sequence of SEQ ID NO: 46, as depicted in FIG. 16; a gep286 polypeptide comprising the amino acid sequence of SEQ ID NO: 49, as depicted in FIG. 17; or a gep76 polypeptide comprising the amino acid sequence of SEQ ID NO: 67, as depicted in FIG. 23.
38. An isolated nucleic acid comprising a sequence selected from the group consisting of: (1) SEQ ID NO: 14, as depicted in FIG. 5, or degenerate variants thereof; (2) SEQ ID NO: 14, or degenerate variants thereof, wherein T is replaced by U; (3) nucleic acids complementary to (1) and (2); (4) fragments of (1), (2), and (3) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 13; (5) SEQ ID NO: 17, as depicted in FIG. 6, or degenerate variants thereof; (6) SEQ ID NO: 17, or degenerate variants thereof, wherein T is replaced by U; (7) nucleic acids complementary to (5) and (6); (8) fragments of (5), (6), and (7) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 16; (9) SEQ ID NO: 26, as depicted in FIG. 9, or degenerate variants thereof; (10) SEQ ID NO: 26, or degenerate variants thereof, wherein T is replaced by U; (11) nucleic acids complementary to (9) and (10); (12) fragments of (9), (10), and (11) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 25; (13) SEQ ID NO: 47, as depicted in FIG. 16, or degenerate variants thereof; (14) SEQ ID NO: 47, or degenerate variants thereof, wherein T is replaced by U; (15) nucleic acids complementary to (13) and (14); (16) fragments of (13), (14), and (15) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 46; (17) SEQ ID NO: 50, as depicted in FIG. 17, or degenerate variants thereof; (18) SEQ ID NO: 50, or degenerate variants thereof, wherein T is replaced by U; (19) nucleic acids complementary to (17) and (18); (20) fragments of (17), (18), and (19) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 49; (21) SEQ ID NO: 68, as depicted in FIG. 23, or degenerate variants thereof; (22) SEQ ID NO: 68, or degenerate variants thereof, wherein T is replaced by U; (23) nucleic acids complementary to (21) and (22); and (24) fragments of (21), (22), and (23) that are at least 15 base pairs in length and which hybridize under stringent conditions to genomic DNA encoding the polypeptide of SEQ ID NO: 67.
39. A method for identifying an antibacterial agent, the method comprising: (a) contacting a test compound with a polypeptide, or a homolog of a polypeptide, encoded by a nucleic acid sequence located within an operon comprising a B-yneS gene; and (b) detecting binding of the test compound to the polypeptide, wherein binding indicates that the test compound is an antibacterial agent.
40. The method of claim 39, further comprising: (c) determining whether a test compound that binds to the polypeptide inhibits growth of bacteria, relative to growth of bacteria cultured in the absence of a test compound that binds to the polypeptide, wherein inhibition of growth indicates that the test compound is an antibacterial agent.
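
For illustration only, and forming no part of the claims: several claims above recite sequences derived from a reference sequence, namely the form in which T is replaced by U, the complementary nucleic acid, fragments of at least 15 bases, and sequences at least 85% identical to a reference. The following minimal Python sketch shows these relationships on a toy sequence. The function names, the toy sequences, and the ungapped identity measure are assumptions introduced here; none is taken from the sequence listing. Hybridization under stringent conditions is an experimental property and is not modeled.

```python
# Toy helpers for the sequence relationships recited in the claims.
# The example sequences are hypothetical; they are not from any SEQ ID NO.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(dna: str) -> str:
    """Return the sequence with every T replaced by U."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def windows(seq: str, size: int = 15) -> list[str]:
    """Return every contiguous fragment of exactly `size` bases."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity of two equal-length sequences.

    Assumption: this is a simplification. Identity between real sequences
    is normally computed from an alignment that allows gaps.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


toy = "ATGGCTAGCTTAGGCTAACG"      # hypothetical 20-base sequence
variant = "ATGGCTAGCATAGGCTAACC"  # the same sequence with two substitutions

print(to_rna(toy))
print(reverse_complement(toy))
print(len(windows(toy)), "fragments of 15 bases")
print(percent_identity(toy, variant) >= 85.0)
```

In practice, percent identity would be taken from the output of a sequence alignment program rather than from a position-by-position comparison of this kind.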